# n-tokenization for python sentences
# open the file attached with the gist and read its content into a str
with open('thought.txt', 'r') as words:
    words_imp = words.read()
# split the content into a list of words
words_imp = words_imp.split(' ')
# decide the word span for tokens: n
n = 3
# make sublists of n words each
# (renamed from i to chunks: the original reused i as both the
# comprehension variable and the result, shadowing itself)
chunks = [words_imp[i:i + n] for i in range(0, len(words_imp), n)]
# chunks is a list of lists, each holding n words from the fetched content
# f is a list of strings, one per chunk
f = [' '.join(chunk) for chunk in chunks]
# the result looks like
# f = ['i think that', 'the whole intuition', 'of the workplace', 'owns you is' ...]
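
# A self-contained sketch of the same chunking, using an in-memory sample
# sentence instead of thought.txt (the sentence below is hypothetical, chosen
# to mirror the example output above):

sample = "i think that the whole intuition of the workplace owns you"
n = 3
words = sample.split(' ')
# group the word list into runs of n, then join each run back into a phrase
phrases = [' '.join(words[i:i + n]) for i in range(0, len(words), n)]
print(phrases)
# ['i think that', 'the whole intuition', 'of the workplace', 'owns you']

# Note the last phrase is shorter than n when the word count is not a
# multiple of n; the slice simply stops at the end of the list.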