nltk.pos_tag()
nltk.pos_tag(tokens, tagset=None)
Use NLTK’s currently recommended part of speech tagger to tag the given list of tokens.
>>> from nltk.tag import pos_tag
>>> from nltk.tokenize import word_tokenize
>>> pos_tag(word_tokenize("John's big idea isn't all that bad."))
[('John', 'NNP'), ("'s", 'POS'), ('big', 'JJ'), ('idea', 'NN'), ('is', 'VBZ'),
("n't", 'RB'), ('all', 'PDT'), ('that', 'DT'), ('bad', 'JJ'), ('.', '.')]
>>> pos_tag(word_tokenize("John's big idea isn't all that bad."), tagset='universal')
[('John', 'NOUN'), ("'s", 'PRT'), ('big', 'ADJ'), ('idea', 'NOUN'), ('is', 'VERB'),
("n't", 'ADV'), ('all', 'DET'), ('that', 'DET'), ('bad', 'ADJ'), ('.', '.')]
NB. Use pos_tag_sents() for efficient tagging of more than one sentence.
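As a rough illustration of the note above, the sketch below wraps pos_tag_sents() in a small helper. The helper name tag_sentences is hypothetical, the input is assumed to be already tokenized (one list of strings per sentence), and the import is done lazily so the definition loads even where NLTK is not installed. Calling it requires NLTK and its tagger model data to be available.

```python
def tag_sentences(sentences, tagset=None):
    """Tag several pre-tokenized sentences with a single call.

    `sentences` is assumed to be a list of token lists, e.g.
    [["John", "runs", "."], ["Mary", "sleeps", "."]].
    The result has one list of (token, tag) pairs per input sentence.
    """
    # Lazy import: NLTK is only needed when the helper is actually called.
    from nltk.tag import pos_tag_sents

    # One batched call instead of calling pos_tag() once per sentence.
    return pos_tag_sents(sentences, tagset=tagset)
```

A call such as tag_sentences([["John", "runs", "."], ["Mary", "sleeps", "."]], tagset="universal") should return two tagged sentences in the same order as the input.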
Parameters:
- tokens (list(str)) – Sequence of tokens to be tagged
- tagset (str) – The tagset to be used, e.g. universal, wsj, brown. Defaults to None.
Returns: The tagged tokens
Return type: list(tuple(str, str))
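Because the return value is a plain list of (token, tag) tuples, it can be processed with ordinary Python. The sketch below groups tokens by tag. The helper group_by_tag is hypothetical and not part of NLTK, and the tagged list is copied from the universal-tagset example above so that the code runs without NLTK installed.

```python
from collections import defaultdict

# Output copied from the tagset='universal' example; hardcoded so this
# sketch does not depend on NLTK or its tagger model.
tagged = [('John', 'NOUN'), ("'s", 'PRT'), ('big', 'ADJ'), ('idea', 'NOUN'),
          ('is', 'VERB'), ("n't", 'ADV'), ('all', 'DET'), ('that', 'DET'),
          ('bad', 'ADJ'), ('.', '.')]


def group_by_tag(tagged_tokens):
    """Map each tag to the tokens that carry it, preserving token order."""
    groups = defaultdict(list)
    for token, tag in tagged_tokens:
        groups[tag].append(token)
    return dict(groups)


groups = group_by_tag(tagged)
print(groups["NOUN"])  # ['John', 'idea']
print(groups["DET"])   # ['all', 'that']
```

The same pattern works on the default tags (NNP, JJ, and so on), since only the tag strings differ.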