nltk.TokenSearcher

class nltk.TokenSearcher(tokens)[source]

A class that makes it easier to use regular expressions to search over tokenized strings. The tokenized string is converted to a string where tokens are marked with angle brackets – e.g., '<the><window><is><still><open>'. The regular expression passed to the findall() method is modified to treat angle brackets as non-capturing parentheses, in addition to matching the token boundaries; and to have '.' not match the angle brackets.

Methods

__init__(tokens)
findall(regexp) Find instances of the regular expression in the text.