jbozhich

WordVector

Dec 6th, 2017
import argparse

from nltk import pos_tag, word_tokenize

# Requires the NLTK tokenizer and POS-tagger data (installed via nltk.download).
parser = argparse.ArgumentParser()
parser.add_argument("file")  # path to a text file with one document per line

options = parser.parse_args()

with open(options.file, 'r') as f:
    gum_text = f.read()

# Treat each non-empty line as a separate document.
gum_docs = [line for line in gum_text.split("\n") if line.strip()]

# Despite its name, this holds one list of (token, POS tag) pairs per document.
frequencies = []

for document in gum_docs:
    tokenized = word_tokenize(document)
    tagged = pos_tag(tokenized)
    frequencies.append(tagged)

# Print the first tagged token of the first document, if there is one.
if frequencies:
    print(frequencies[0][0])
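
The script above stores tagged tokens in a list called frequencies but never counts anything. As a minimal sketch of one possible next step, and not necessarily what the author had in mind, the tagged documents could be turned into per-document counts with collections.Counter. The helper name count_tagged and the sample data are hypothetical; lowercasing the words is also an assumption.

```python
from collections import Counter


def count_tagged(tagged_docs):
    """Return one Counter of (lowercased word, tag) pairs per document.

    Hypothetical helper: expects the same shape the script builds,
    i.e. a list of documents, each a list of (word, tag) tuples.
    """
    return [
        Counter((word.lower(), tag) for word, tag in doc)
        for doc in tagged_docs
    ]


# Hypothetical sample in the shape pos_tag returns, so the sketch
# runs without any NLTK data installed.
sample = [
    [("The", "DT"), ("dog", "NN"), ("saw", "VBD"), ("the", "DT"), ("dog", "NN")],
    [("Dogs", "NNS"), ("bark", "VBP")],
]

counts = count_tagged(sample)
print(counts[0].most_common(2))
```

In the script itself, the call would be count_tagged(frequencies) after the tagging loop.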