bs4.UnicodeDammit.detwingle
classmethod UnicodeDammit.detwingle(in_bytes, main_encoding='utf8', embedded_encoding='windows-1252')
Fix characters from one encoding embedded in some other encoding.
Currently, the only supported situation is Windows-1252 (or its subset ISO-8859-1) embedded in UTF-8.
The input must be a bytestring. If you’ve already converted the document to Unicode, you’re too late.
The output is a bytestring in which embedded_encoding characters have been converted to their main_encoding equivalents.
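As a rough illustration of the behavior described above, the sketch below builds a bytestring that mixes UTF-8 and Windows-1252, then passes it through detwingle() before decoding. The variable names (snowmen, quote, doc, fixed) are chosen for this example only, and it assumes the bs4 package is installed.

```python
from bs4 import UnicodeDammit

# Three snowmen encoded as UTF-8, followed by a curly-quoted phrase
# encoded as Windows-1252: a mixed-encoding bytestring.
snowmen = "\N{SNOWMAN}" * 3
quote = "\N{LEFT DOUBLE QUOTATION MARK}I like snowmen!\N{RIGHT DOUBLE QUOTATION MARK}"
doc = snowmen.encode("utf8") + quote.encode("windows-1252")

# Decoding the mixed bytes as UTF-8 fails on the Windows-1252 quote bytes.
try:
    doc.decode("utf8")
    mixed_is_valid_utf8 = True
except UnicodeDecodeError:
    mixed_is_valid_utf8 = False

# detwingle() takes bytes and returns bytes, here pure UTF-8.
fixed = UnicodeDammit.detwingle(doc)
text = fixed.decode("utf8")

print(mixed_is_valid_utf8)      # False
print(text == snowmen + quote)  # True
```

Note that detwingle() is called on the raw bytes before any decoding step; once the document has been converted to a Unicode string, the embedded Windows-1252 bytes can no longer be distinguished and repaired this way.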