nltk.guess_encoding()
nltk.guess_encoding(data)
Given a byte string, attempt to decode it. Tries the standard ‘UTF8’ and ‘latin-1’ encodings, plus several gathered from locale information.
The calling program must first call:
locale.setlocale(locale.LC_ALL, '')
If successful it returns (decoded_unicode, successful_encoding). If unsuccessful it raises a UnicodeError.
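The behavior described above can be illustrated with a minimal, standard-library-only sketch. It is not NLTK's implementation: the function name guess_encoding_sketch, the choice of locale lookups, and the order in which candidates are tried (UTF-8 first, latin-1 last) are all assumptions made for illustration.

```python
import locale


def guess_encoding_sketch(data: bytes):
    """Hypothetical stand-in for the documented behavior; not NLTK's code."""
    # Assumption: try UTF-8 first and latin-1 last, because latin-1 accepts
    # every byte sequence and would otherwise mask the other candidates.
    candidates = ["utf-8"]

    # Encodings gathered from locale information. These are only meaningful
    # after locale.setlocale(locale.LC_ALL, '') has been called.
    try:
        candidates.append(locale.getpreferredencoding(False))
    except (locale.Error, ValueError):
        pass
    try:
        candidates.append(locale.getlocale()[1])
    except (locale.Error, ValueError):
        pass

    candidates.append("latin-1")

    for enc in candidates:
        if not enc:
            continue  # the locale lookup may yield None
        try:
            return data.decode(enc), enc
        except (UnicodeError, LookupError):
            continue  # wrong or unknown encoding: try the next candidate
    raise UnicodeError("could not decode data with any of: %r" % candidates)


# The calling program sets the locale first, as the documentation requires.
try:
    locale.setlocale(locale.LC_ALL, "")
except locale.Error:
    pass  # fall back to the default "C" locale

text, encoding = guess_encoding_sketch("café".encode("utf-8"))
```

Because latin-1 maps every possible byte to a character, a fallback chain that ends with it will rarely reach the UnicodeError branch in practice. A successful return therefore means only that the bytes were decodable in that encoding, not that the guess is correct.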