Untitled

a guest
May 28th, 2015
Python
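import requests
from bs4 import BeautifulSoup
from collections import Counter
from string import punctuation

# Fetch the article and fail fast on an HTTP error.
r = requests.get("https://en.wikipedia.org/wiki/Wolfgang_Amadeus_Mozart")
r.raise_for_status()

# Name the parser explicitly so bs4 does not have to guess one.
soup = BeautifulSoup(r.content, "html.parser")

# Lazily yield the text of each <p> element.
text = (p.get_text() for p in soup.find_all('p'))

# Strip punctuation from both ends of each word and lowercase it.
words = (x.strip(punctuation).lower() for y in text for x in y.split())

# Count the words, skipping tokens that were punctuation only.
c = Counter(w for w in words if w)

print(c.most_common())  # prints (word, count) pairs, starting with the most common.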