nltk.HunposTagger.__init__

HunposTagger.__init__(path_to_model, path_to_bin=None, encoding='ISO-8859-1', verbose=False)

Starts the hunpos-tag executable and establishes a connection with it.

Parameters:
  • path_to_model – Path to the hunpos model file.
  • path_to_bin – Path to the hunpos-tag binary.
  • encoding – The encoding used by the model. Unicode tokens passed to the tag() and tag_sents() methods are converted to this charset before they are sent to hunpos-tag. The default is ISO-8859-1 (Latin-1).

    This parameter is ignored for byte-string tokens (str in Python 2, bytes in Python 3), which are sent as-is. The caller must ensure that such tokens are already encoded in the right charset.
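
The encoding rule described above can be illustrated with a minimal sketch. It is not NLTK's implementation: encode_token and build_tagger are hypothetical helpers introduced here for illustration, and the sketch assumes Python 3, where text tokens are str and byte-string tokens are bytes. build_tagger only shows how the constructor might be called; it needs NLTK, a hunpos model file, and the hunpos-tag binary, so it is not called in the sketch itself.

```python
def encode_token(token, encoding="ISO-8859-1"):
    """Hypothetical helper mirroring the documented rule; not part of NLTK."""
    if isinstance(token, bytes):
        # Byte strings are passed through unchanged; the caller owns the charset.
        return token
    # Text tokens are converted to the charset the model expects.
    return token.encode(encoding)


def build_tagger(path_to_model, path_to_bin=None):
    """Construct a tagger (assumes NLTK, a model file, and hunpos-tag are available)."""
    # Imported lazily so encode_token can be tried without NLTK installed.
    from nltk.tag import HunposTagger

    return HunposTagger(path_to_model, path_to_bin=path_to_bin, encoding="ISO-8859-1")


if __name__ == "__main__":
    print(encode_token("caf\u00e9"))           # text token, Latin-1 bytes
    print(encode_token("caf\u00e9", "utf-8"))  # same token for a UTF-8 model
    print(encode_token(b"caf\xe9"))            # bytes token, sent as-is
```

The same text token produces different bytes depending on the charset, so the encoding argument should match the charset the model was trained on.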