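- from nltk.tokenize import wordpunct_tokenize
- # Tokenize every document; wordpunct_tokenize splits text into runs of
- # word characters and runs of punctuation.
- doc_words2 = [wordpunct_tokenize(docs[fileid]) for fileid in fileids]
- # Print the tokens of one sample text, one per line, separated by dividers.
- print('\n-----\n'.join(wordpunct_tokenize(docs[1][0])))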
- OUTPUT:
- Good
- -----
- morning
- -----
- .
- -----
- How
- -----
- are
- -----
- you
- -----
- ?
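
The output above can be reproduced without NLTK or the `docs` corpus, as a rough sketch. It assumes `wordpunct_tokenize` behaves like the regular expression `\w+|[^\w\s]+` (the pattern NLTK documents for `WordPunctTokenizer`). The function name `regex_wordpunct_tokenize` and the sample string are hypothetical, chosen only to match the output shown.

```python
import re

# Assumed equivalent of NLTK's WordPunctTokenizer pattern:
# a run of word characters, or a run of non-word, non-space characters.
_WORDPUNCT_PATTERN = re.compile(r"\w+|[^\w\s]+")


def regex_wordpunct_tokenize(text):
    """Hypothetical stand-in for nltk.tokenize.wordpunct_tokenize."""
    return _WORDPUNCT_PATTERN.findall(text)


# Sample text assumed from the output shown above.
sample = "Good morning. How are you?"
tokens = regex_wordpunct_tokenize(sample)

# Same display format as the original print call.
print('\n-----\n'.join(tokens))
```

Punctuation comes out as separate tokens, which is why `.` and `?` appear on their own lines in the output.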