a guest
Mar 20th, 2018
# n-gram tokenization for Python sentences
# open the file (attached with the gist) and read its content into a string;
# the with-block closes the file automatically
with open('thought.txt', 'r') as words:
    words_imp = words.read()

# split the content into a list of words
words_imp = words_imp.split(' ')

# decide the word span for tokens: n
n = 3

# make sublists of n words each
chunks = [words_imp[i:i + n] for i in range(0, len(words_imp), n)]

# chunks is a list of lists, each containing n words from the fetched content

# f is a list of strings, one per chunk
f = [' '.join(a) for a in chunks]

# result looks like
# f = ['i think that', 'the whole intuition', 'of the workplace', 'owns you is', ...]
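To see the chunking logic in isolation, here is a minimal, self-contained sketch that runs the same list comprehension on an inline sentence instead of thought.txt (the file that ships with the gist); the sentence here is a hypothetical stand-in for the file's content:

```python
# stand-in for the content read from thought.txt
words = "i think that the whole intuition of the workplace owns you".split(' ')

# word span per token
n = 3

# non-overlapping sublists of up to n words each (the last chunk may be shorter)
chunks = [words[i:i + n] for i in range(0, len(words), n)]

# join each chunk back into a single string
f = [' '.join(a) for a in chunks]

print(f)
# ['i think that', 'the whole intuition', 'of the workplace', 'owns you']
```

Note these are non-overlapping chunks: each word appears in exactly one token, unlike sliding-window n-grams where consecutive tokens share n-1 words.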