# Posted to Pastebin by a guest on Feb 9th, 2016
# -*- coding: utf-8 -*-
# Assumes Python 3 with the mecab-python3 bindings installed.
import MeCab


def getMorph(sentence):
    """Split a sentence into a list of surface forms using MeCab."""
    tagger = MeCab.Tagger()
    node = tagger.parseToNode(sentence)
    morph = []
    while node:
        # The BOS/EOS sentinel nodes have an empty surface; skip them.
        if node.surface:
            morph.append(node.surface)
        node = node.next
    return morph


def cntword(words):
    """Return [word, count] pairs sorted by descending count."""
    word_count = []
    for w in set(words):
        word_count.append([w, words.count(w)])
    return sorted(word_count, key=lambda x: x[1], reverse=True)


# Read the input as UTF-8 so MeCab receives properly decoded text.
with open("neko.txt", "r", encoding="utf-8") as rfp:
    data = rfp.readlines()

morph = getMorph("".join(data))
cnt = cntword(morph)
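
The counting step above calls `words.count(w)` once per distinct word, so it rescans the whole list each time. A possible alternative, sketched here under the assumption that only the `[word, count]` pairs in descending order are needed, is `collections.Counter` from the standard library. The name `count_words` and the sample list are hypothetical and not part of the original paste.

```python
from collections import Counter


def count_words(words):
    """Return [word, count] pairs, most frequent first."""
    # Counter tallies every element in a single pass over the list;
    # most_common() then returns the (word, count) pairs sorted by
    # descending count.
    return [[w, c] for w, c in Counter(words).most_common()]


# Hypothetical sample standing in for the output of getMorph().
sample = ["猫", "は", "猫", "で", "ある", "猫"]
print(count_words(sample)[0])  # ['猫', 3]
```

If this behaves as expected on real input, `cnt = count_words(morph)` could stand in for `cnt = cntword(morph)`. The order of words with equal counts may differ between the two versions.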