Untitled

a guest
Sep 16th, 2019
import nltk
from nltk.tokenize import word_tokenize

# Fetch the Punkt tokenizer models that word_tokenize relies on.
# Newer NLTK releases may also require nltk.download('punkt_tab').
nltk.download('punkt')

text = ("It is the branch of data science that consists of systematic processes "
        "for analyzing, understanding, and deriving information from the text data "
        "in a smart and efficient manner.")

# Split the sentence into word and punctuation tokens.
tokens = word_tokenize(text)
print(tokens)

# Normalize case, then show the first five tokens.
tokens = [word.lower() for word in tokens]
print(tokens[:5])