ethanweed

find unigrams+bigrams

Sep 28th, 2020
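word = "bold"

def seq_ngrams(xs, n):
    """Return every contiguous run of n items from the sequence xs."""
    return [xs[i:i+n] for i in range(len(xs)-n+1)]

def shingle(text, w):
    """Return the character w-grams of text, with letters separated by spaces."""
    tokens = list(text)  # split the string into single characters
    return [' '.join(xs) for xs in seq_ngrams(tokens, w)]

# Unigrams: the individual letters of the word
unigrams = list(word)

# Bigrams: adjacent letter pairs, e.g. 'b o' (use ''.join in shingle to get 'bo')
bigrams = shingle(word, 2)

spellings = unigrams + bigrams

print(spellings)  # ['b', 'o', 'l', 'd', 'b o', 'o l', 'l d']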