BeautifulSoup parses HTML entities into Unicode when it reads input. I can convert these back to HTML entities by calling .prettify(formatter='html') on an HTML element.
But the NavigableString class doesn't have a .prettify() method. I want to turn a NavigableString into a string containing the proper HTML entities. How can I do this?
The only way I can think of is to surround it with a fake <a> tag, call .prettify() on the tag, and strip the opening and closing tags from the resulting string.
Had to resort to UTSL (use the source, Luke). The corresponding method is NavigableString.output_ready().
>>> u=BeautifulSoup('&alpha;')
>>> u
<html><body><p>α</p></body></html>
>>> u.p.contents[0]
u'\u03b1'
>>> u.p.contents[0].output_ready(formatter='html')
u'&alpha;'
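The session above is Python 2. For anyone on Python 3, here is a minimal sketch of the same idea, assuming the bs4 package is installed and using the built-in html.parser. The helper name to_html_entities is my own, not part of BeautifulSoup, and the input is wrapped in an explicit <p> tag because html.parser does not add one.

```python
from bs4 import BeautifulSoup


def to_html_entities(text_node):
    """Return the text of a NavigableString with named HTML entities restored.

    Hypothetical helper: it only wraps NavigableString.output_ready().
    """
    return text_node.output_ready(formatter="html")


# The parser decodes &alpha; into the Unicode character U+03B1 on input.
soup = BeautifulSoup("<p>&alpha;</p>", "html.parser")
node = soup.p.string  # NavigableString holding the decoded character

encoded = to_html_entities(node)

print(repr(str(node)))
print(encoded)
```

This avoids the fake-tag workaround from the question, since the formatter is applied to the string node itself.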