nltk.SpaceTokenizer

class nltk.SpaceTokenizer

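Tokenize a string using the space character as a delimiter, which is the same as s.split(' '). Only the space character counts as a delimiter: newlines and tabs stay inside tokens, and two consecutive spaces produce an empty-string token.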

>>> from nltk.tokenize import SpaceTokenizer
>>> s = "Good muffins cost $3.88\nin New York.  Please buy me\ntwo of them.\n\nThanks."
>>> SpaceTokenizer().tokenize(s)
['Good', 'muffins', 'cost', '$3.88\nin', 'New', 'York.', '',
'Please', 'buy', 'me\ntwo', 'of', 'them.\n\nThanks.']
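
The contrast with whitespace splitting can be seen without NLTK at all. The following is a minimal sketch using only the built-in str.split; the variable names space_tokens and whitespace_tokens are illustrative and not part of NLTK.

```python
s = "Good muffins cost $3.88\nin New York.  Please buy me\ntwo of them.\n\nThanks."

# Splitting on a single space keeps newlines inside tokens and yields an
# empty string for the two consecutive spaces after "York.".
space_tokens = s.split(' ')

# Splitting with no argument treats any run of whitespace (spaces,
# newlines, tabs) as one delimiter and never yields empty strings.
whitespace_tokens = s.split()

print(space_tokens)
print(whitespace_tokens)
```

The first list matches the SpaceTokenizer output shown above. The second is usually what is wanted when the input contains line breaks or irregular spacing.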

Methods

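span_tokenize(s) Identify the tokens using integer offsets (start_i, end_i), where s[start_i:end_i] is the corresponding token.
span_tokenize_sents(strings) Apply self.span_tokenize() to each element of strings.
tokenize(s) Return a tokenized copy of s.
tokenize_sents(strings) Apply self.tokenize() to each element of strings.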
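
To illustrate what span offsets and the per-string variants mean, here is a rough pure-Python sketch. The helpers space_spans and tokenize_each are hypothetical names written for this example, not NLTK functions, and the sketch mirrors s.split(' ') exactly. NLTK's own span_tokenize may handle leading or trailing delimiters differently.

```python
def space_spans(s, sep=' '):
    """Yield (start, end) offsets so that s[start:end] is each token.

    Hypothetical helper mirroring s.split(sep); not NLTK's implementation.
    """
    left = 0
    while True:
        right = s.find(sep, left)
        if right == -1:
            # No delimiter left: the rest of the string is the last token.
            yield left, len(s)
            return
        yield left, right
        left = right + len(sep)


def tokenize_each(strings):
    """Apply space splitting to each string, as the *_sents methods do."""
    return [s.split(' ') for s in strings]


text = "a b  c"
spans = list(space_spans(text))
print(spans)                              # [(0, 1), (2, 3), (4, 4), (5, 6)]
print([text[i:j] for i, j in spans])      # ['a', 'b', '', 'c']
print(tokenize_each(["New York", "two of them"]))
```

Slicing the original string with each span recovers the same tokens as splitting, including the empty token produced by the double space.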