Python 2.7 - convert BeautifulSoup NavigableString to use HTML entities

Thursday, May 15, 2014

Python 2.7 - convert BeautifulSoup NavigableString to use HTML entities - Stack Overflow

So BeautifulSoup parses HTML entities into Unicode upon reading input. I can convert these back to HTML entities if I use .prettify(formatter='html') on an HTML element.

But the NavigableString class doesn't have a .prettify() method. I want to turn a NavigableString into a string containing the proper HTML entities. How can I do this?

The only way I can think of is to wrap it in a fake <a> tag, call .prettify() on the tag, and strip the opening and closing tag from the resulting string.

Had to resort to UTSL (reading the source). The corresponding method is NavigableString.output_ready().

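>>> from bs4 import BeautifulSoup
>>> u = BeautifulSoup('&alpha;')
>>> u
<html><body><p>α</p></body></html>
>>> # the entity was decoded to a Unicode character at parse time
>>> u.p.contents[0]
u'\u03b1'
>>> # output_ready() applies the formatter to the NavigableString itself
>>> u.p.contents[0].output_ready(formatter='html')
u'&alpha;'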
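
For intuition about what the 'html' formatter is doing, here is a rough standard-library sketch that swaps every character with a named HTML entity for its &name; form. The helper name to_html_entities is made up for illustration, and this is not Beautiful Soup's actual implementation, so edge cases may differ from output_ready(formatter='html').

```python
try:
    from html.entities import codepoint2name      # Python 3
except ImportError:
    from htmlentitydefs import codepoint2name     # Python 2.7


def to_html_entities(text):
    """Replace each character that has a named HTML entity with &name;.

    Hypothetical helper for illustration only; characters without a
    named entity are passed through unchanged.
    """
    pieces = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        pieces.append(u"&%s;" % name if name else ch)
    return u"".join(pieces)


print(to_html_entities(u"\u03b1 < \u03b2"))  # prints: &alpha; &lt; &beta;
```

If Beautiful Soup is already in use, output_ready() is probably the better choice, since it stays consistent with how the rest of the tree is serialized.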
