nltk.tokenize.word_tokenize()

nltk.tokenize.word_tokenize(text, language='english')

    Return a tokenized copy of text, using NLTK's recommended word tokenizer (currently TreebankWordTokenizer, along with PunktSentenceTokenizer for the specified language).

    Parameters:
        - text – text to split into words
        - language – the model name in the Punkt corpus