
Untitled

a guest
Jul 23rd, 2019
corpuses = {
    "The Wasteland": "I will show you fear in a handful of dust. In the mountains there you feel free. A crowd flowed over London Bridge.",
    "The Road Not Taken": "Two roads diverged in a yellow wood. I kept the first for another day. I doubted if I should ever come back.",
    "Trees": "I think that I shall never see a poem lovely as a tree. A tree that looks at God all day and lifts her leafy arms to pray."
}


def digest(corpus):
    """Digest a single corpus into a dictionary of the form {word: {sentence: frequency}}."""
    result = {}
    for sentence in corpus.split("."):
        # Strip surrounding whitespace so the sentence key has no leading space.
        sentence = sentence.strip()
        # Lowercase every word so that counting is case-insensitive.
        words = [word.lower() for word in sentence.split()]
        for word in words:
            frequency = words.count(word)
            result.setdefault(word, {})[sentence] = frequency
    return result


def combine_corpuses(corpuses):
    """Merge a list of digests into a single {word: {sentence: frequency}} dictionary."""
    result = {}
    # Collect every word that appears in at least one digest.
    all_words = {word for corpus in corpuses for word in corpus}
    for word in all_words:
        combined_d = {}
        for corpus in corpuses:
            combined_d.update(corpus.get(word, {}))
        result[word] = combined_d
    return result


digested_corpuses = [digest(corpus) for corpus in corpuses.values()]
single_digest = combine_corpuses(digested_corpuses)
print(single_digest)
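A minimal sketch of how a digest of the form {word: {sentence: frequency}} might be queried. The helper name `sentences_containing` and the `sample_digest` data are hypothetical and not part of the original paste; the sample is written by hand in the same shape so the block runs on its own.

```python
# Hypothetical helper (an assumption, not from the original paste): look up one
# word in a digest of the form {word: {sentence: frequency}}.
def sentences_containing(digest, word):
    """Return (sentence, frequency) pairs for `word`, most frequent first."""
    # Digest keys are lowercase, so normalise the query the same way.
    hits = digest.get(word.lower(), {})
    return sorted(hits.items(), key=lambda item: item[1], reverse=True)


# Small hand-written sample in the same shape as the combined digest.
sample_digest = {
    "tree": {
        "I think that I shall never see a poem lovely as a tree": 1,
        "A tree that looks at God all day and lifts her leafy arms to pray": 1,
    },
    "i": {
        "I kept the first for another day": 1,
        "I think that I shall never see a poem lovely as a tree": 2,
    },
}

print(sentences_containing(sample_digest, "I"))
```

Returning a sorted list of pairs keeps the nested dictionary untouched; a word that is absent from the digest simply yields an empty list.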