Untitled

a guest
Apr 28th, 2015
For each word in the corpus, determine all (or some) N-Grams that are synonyms of that word and
have the same hashcode as the word once whitespace is removed. Essentially, split the word
(introduce whitespace) such that the resulting N-Gram is a synonym. The list of words is in the
attached zip file.
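
As a rough illustration of the splitting step only, the sketch below enumerates every way to break a word into pieces that all appear in a vocabulary. The function name `segmentations`, the `min_parts` parameter, and the tiny vocabulary are hypothetical and not part of the assignment; the real word list comes from the zip file. This only checks dictionary membership, so deciding whether a candidate is actually a synonym is still a separate step.

```python
from functools import lru_cache


def segmentations(word, vocabulary, min_parts=2):
    """Return every split of `word` into vocabulary words, as spaced strings."""
    vocab = frozenset(vocabulary)

    @lru_cache(maxsize=None)
    def split_from(start):
        # Each result is a tuple of pieces that exactly covers word[start:].
        if start == len(word):
            return ((),)
        results = []
        for end in range(start + 1, len(word) + 1):
            piece = word[start:end]
            if piece in vocab:
                for rest in split_from(end):
                    results.append((piece,) + rest)
        return tuple(results)

    # Require at least `min_parts` pieces so the unsplit word is not returned.
    return [" ".join(parts) for parts in split_from(0) if len(parts) >= min_parts]


# Hypothetical vocabulary, for illustration only.
vocabulary = {"active", "wear", "basket", "ball", "milk", "jeans"}
for w in ["activewear", "basketball", "milk", "jeans"]:
    print(w, "-", segmentations(w, vocabulary) or "NA")
```

By construction, every candidate returned here collapses back to the original word when its whitespace is removed.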

What is an N-Gram?
In the fields of computational linguistics and probability, an n-gram is a contiguous
sequence of n items from a given sequence of text or speech.
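
To make the definition concrete, here is a minimal sketch that lists the n-grams of a token sequence. The helper name `ngrams` is an illustrative choice, not something the assignment prescribes.

```python
def ngrams(items, n):
    """Return all contiguous length-n windows of `items` as tuples."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence shorter than n has no n-grams, so the range is empty.
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


tokens = "the quick brown fox".split()
print(ngrams(tokens, 2))
# [('the', 'quick'), ('quick', 'brown'), ('brown', 'fox')]
```

In this sense, "active wear" is a 2-gram (bigram) whose items are the words "active" and "wear".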

Feel free to use any open source project or library. Please read about WordNet Similarity.

Example

Input
activewear
basketball
milk
jeans

Output
activewear - active wear
basketball - basket ball
milk - NA
jeans - NA

Note
- NA means not available.
- "sports wear" and "sportswear" may be synonyms of activewear, but they do not have the same
  hashcode as activewear on whitespace removal.
- Any doubts about the question above can be sent to "abhisheksh AT unbxd dot com".
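
The whitespace-removal condition in the notes can be sketched as a small check. This assumes that "same hashcode" is meant as "the same string once whitespace is removed" (equal strings always hash equally); the function names are hypothetical.

```python
def strip_whitespace(text):
    """Remove all whitespace characters from `text`."""
    return "".join(text.split())


def matches_on_whitespace_removal(word, candidate):
    """True if `candidate` collapses to `word` once whitespace is removed."""
    # Comparing the strings directly avoids being fooled by hash collisions.
    return strip_whitespace(candidate) == strip_whitespace(word)


print(matches_on_whitespace_removal("activewear", "active wear"))  # True
print(matches_on_whitespace_removal("activewear", "sports wear"))  # False
```

A candidate such as "sports wear" fails this check even if a thesaurus lists it as a synonym, which is why it does not appear in the expected output.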