bs4.UnicodeDammit
class bs4.UnicodeDammit(markup, override_encodings=[], smart_quotes_to=None, is_html=False, exclude_encodings=[])

A class for detecting the encoding of a *ML (HTML or XML) document and converting it to a Unicode string. If the source encoding is windows-1252, it can replace MS smart quotes with their HTML or XML equivalents.
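As a minimal sketch of the conversion described above, the snippet below passes windows-1252 bytes to the constructor and asks for smart quotes to be rewritten as HTML entities. The sample bytes and the variable names are illustrative assumptions, and the encoding list is passed positionally because the name of that second parameter may differ between Beautiful Soup releases. The `unicode_markup` and `original_encoding` attributes are not listed in the signature above; they are the attributes the Beautiful Soup documentation uses to read back the result.

```python
from bs4 import UnicodeDammit

# Illustrative input: windows-1252 bytes containing MS "smart quotes"
# (\x93 and \x94 are curly double quotes, \x92 is a curly apostrophe).
markup = b"<p>I just \x93love\x94 Microsoft Word\x92s smart quotes</p>"

# The second positional argument is the list of encodings to try first.
dammit = UnicodeDammit(markup, ["windows-1252"], smart_quotes_to="html")

converted = dammit.unicode_markup      # the document as a Unicode string
detected = dammit.original_encoding    # the encoding that was used to decode it

print(converted)
print(detected)
```

Passing `smart_quotes_to="xml"` instead should produce numeric XML character references rather than named HTML entities.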
Methods
__init__(markup[, override_encodings, ...]) |
detwingle(in_bytes[, main_encoding, ...]) | Fix characters from one encoding embedded in some other encoding.
find_codec(charset) |
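The `detwingle` entry above can be illustrated with a short sketch, adapted from the pattern shown in the Beautiful Soup documentation. It builds a byte string that mixes UTF-8 with windows-1252 and then normalizes it to pure UTF-8. The sample text and variable names are illustrative assumptions, and the call relies on the default values of the optional arguments.

```python
from bs4 import UnicodeDammit

# Illustrative data: three snowmen encoded as UTF-8, followed by a
# quotation whose curly quotes are encoded as windows-1252.
snowmen = "\N{SNOWMAN}" * 3
quote = "\N{LEFT DOUBLE QUOTATION MARK}I like snowmen!\N{RIGHT DOUBLE QUOTATION MARK}"
mixed = snowmen.encode("utf8") + quote.encode("windows_1252")

# detwingle takes bytes and returns bytes in a single encoding
# (UTF-8 by default), re-encoding the embedded windows-1252 characters.
fixed = UnicodeDammit.detwingle(mixed)

text = fixed.decode("utf8")
print(text)
```

Without the `detwingle` step, decoding `mixed` as UTF-8 would fail on the windows-1252 quote bytes, so the call is best made before the bytes are handed to a parser or to the `UnicodeDammit` constructor.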