nltk.TabTokenizer

class nltk.TabTokenizer

Tokenize a string using the tab character as a delimiter, the same as s.split('\t').

>>> from nltk.tokenize import TabTokenizer
>>> TabTokenizer().tokenize('a\tb c\n\t d')
['a', 'b c\n', ' d']
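
Because the tokenizer is described as equivalent to s.split('\t'), its handling of edge cases can be previewed with plain str.split. The short sketch below uses only the standard library and illustrative input strings. It shows that consecutive or trailing tabs produce empty tokens and that spaces are not treated as delimiters.

```python
# Plain str.split('\t'), which the description above says TabTokenizer matches.
text = "a\t\tb\t"

tokens = text.split("\t")
print(tokens)  # ['a', '', 'b', '']

# Spaces are not delimiters here, so 'a b' stays a single token.
spaced = "a b\tc".split("\t")
print(spaced)  # ['a b', 'c']
```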

Methods

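span_tokenize(s) Identify the tokens using integer offsets (start_i, end_i), where s[start_i:end_i] is the corresponding token.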
span_tokenize_sents(strings) Apply self.span_tokenize() to each element of strings.
tokenize(s) Return a tokenized copy of s.
tokenize_sents(strings) Apply self.tokenize() to each element of strings.
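
To make the relationship between tokens and offsets concrete, here is a minimal pure-Python sketch. The helpers tab_spans and tab_spans_sents are hypothetical names for illustration and are not part of NLTK. They are written to mirror s.split('\t') exactly, so NLTK's own span_tokenize() may treat edge cases such as leading or trailing tabs differently.

```python
def tab_spans(s):
    """Yield (start, end) offsets of the tab-delimited fields of s.

    Hypothetical helper for illustration; not part of NLTK.
    """
    start = 0
    while True:
        end = s.find("\t", start)
        if end == -1:
            # No more tabs: the rest of the string is the last field.
            yield start, len(s)
            return
        yield start, end
        start = end + 1  # Skip past the tab itself.


def tab_spans_sents(strings):
    """Apply tab_spans() to each element of strings (hypothetical helper)."""
    return [list(tab_spans(s)) for s in strings]


text = "a\tb c\n\t d"
spans = list(tab_spans(text))
print(spans)  # [(0, 1), (2, 6), (7, 9)]

# Slicing with the offsets recovers the same tokens as split('\t').
recovered = [text[i:j] for i, j in spans]
print(recovered == text.split("\t"))  # True
```

Offsets are useful when tokens need to be mapped back to positions in the original string, for example to highlight a field or align annotations.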