Untitled

a guest
Jul 16th, 2018
from sklearn.feature_extraction.text import CountVectorizer

# List of text documents
text = ["this is test doc", "this is another test doc"]

# Create the vectorizer
vector = CountVectorizer()

# Tokenize the documents and build the vocabulary
vector.fit(text)

# Print the learned vocabulary (token -> column index)
print(vector.vocabulary_)

# Transform the documents into a document-term count matrix
X_Train = vector.transform(text)

# Print the shape and type of the transformed matrix
print(X_Train.shape)
print(type(X_Train))
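
To make the fit and transform steps above less opaque, here is a rough pure-Python sketch of the same idea. It is not scikit-learn's implementation: it only assumes the documented defaults (lowercasing, tokens of two or more word characters, vocabulary indices assigned in sorted order). The helper names `tokenize`, `build_vocabulary`, and `count_matrix` are made up for illustration.

```python
import re
from collections import Counter

# Assumption: approximates CountVectorizer's default tokenization
# (lowercase, tokens made of 2+ word characters).
TOKEN_PATTERN = re.compile(r"\b\w\w+\b")


def tokenize(doc):
    """Lowercase a document and split it into tokens."""
    return TOKEN_PATTERN.findall(doc.lower())


def build_vocabulary(docs):
    """Map each distinct token to a column index, in sorted order."""
    terms = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {term: idx for idx, term in enumerate(terms)}


def count_matrix(docs, vocabulary):
    """Return one row of token counts per document (dense, as lists)."""
    rows = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        # Dict iteration follows insertion order, which matches the indices.
        rows.append([counts.get(term, 0) for term in vocabulary])
    return rows


docs = ["this is test doc", "this is another test doc"]
vocab = build_vocabulary(docs)
matrix = count_matrix(docs, vocab)

print(vocab)
print(matrix)
print((len(matrix), len(vocab)))  # rows = documents, columns = vocabulary size
```

Unlike this sketch, `CountVectorizer.transform` returns a SciPy sparse matrix rather than nested lists, which matters once the vocabulary grows large.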