Untitled

a guest
Jul 23rd, 2019
from collections import Counter

corpuses = {
    "The Wasteland": "I will show you fear in a handful of dust. In the mountains there you feel free. A crowd flowed over London Bridge.",
    "The Road Not Taken": "Two roads diverged in a yellow wood. I kept the first for another day. I doubted if I should ever come back.",
    "Trees": "I think that I shall never see a poem lovely as a tree. A tree that looks at God all day and lifts her leafy arms to pray."
}


def digest(corpus):
    """Digest one corpus into a dictionary of the form {word: {sentence: frequency}}."""
    result = {}
    for sentence in corpus.split("."):
        sentence = sentence.strip()
        if not sentence:
            continue  # skip the empty fragment left after the final period
        # Count each distinct word once rather than calling list.count() per occurrence.
        word_counts = Counter(word.lower() for word in sentence.split())
        for word, frequency in word_counts.items():
            result.setdefault(word, {})[sentence] = frequency
    return result


def combine_corpuses(corpuses):
    """Merge digested corpuses into a single {word: {sentence: frequency}} dictionary."""
    result = {}
    all_words = {word for corpus in corpuses for word in corpus}
    for word in all_words:
        combined_d = {}
        for corpus in corpuses:
            combined_d.update(corpus.get(word, {}))
        result[word] = combined_d
    return result


digested_corpuses = [digest(corpus) for corpus in corpuses.values()]
single_digest = combine_corpuses(digested_corpuses)
# Print only the words that occur in more than one sentence.
for k, v in single_digest.items():
    if len(v) > 1:
        print(k, v)