Untitled

a guest
Jul 7th, 2015
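# Assumes Beautiful Soup 4 (bs4); the original paste shows no imports.
from bs4 import BeautifulSoup

inputfilename = 'file.html'

# 'file.html' is a local path, not a URL, so read it with open() rather than urllib2.urlopen().
with open(inputfilename) as f:
    data = f.read()

# Parse, pretty-print, and re-parse so that each tag and text node sits on its own line.
soup = BeautifulSoup(data, 'html.parser')
data = soup.prettify()
soup = BeautifulSoup(data, 'html.parser')

# Remove every element whose class is 'pno' or 'subhead' from the tree.
for cls in ('pno', 'subhead'):
    for t in soup.find_all(attrs={'class': cls}):
        t.extract()

# Print each remaining text node with surrounding whitespace and tabs removed.
lines = []
for s in soup(text=True):
    s = s.strip().replace('\t', '')
    lines.append(s)
    print(s)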
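
The same idea (drop elements with class 'pno' or 'subhead', then print the remaining text) can also be sketched with only the standard library, which may help when Beautiful Soup is not installed. This is a minimal sketch, not the paste author's method: it swaps in `html.parser.HTMLParser` for Beautiful Soup, assumes reasonably well-formed HTML where non-void tags are explicitly closed, and skips blank text nodes. The class name `TextWithoutClasses` and the inline `sample` markup are hypothetical, made up for illustration.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect depth tracking.
VOID_TAGS = {'area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
             'link', 'meta', 'source', 'track', 'wbr'}


class TextWithoutClasses(HTMLParser):
    """Hypothetical helper: collect text, skipping elements with given classes."""

    def __init__(self, skip_classes):
        super().__init__()
        self.skip_classes = set(skip_classes)
        self.skip_depth = 0  # greater than 0 while inside a skipped element
        self.lines = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get('class') or '').split()
        # Start skipping at a matching element; keep counting nested tags inside it.
        if self.skip_depth or self.skip_classes.intersection(classes):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip().replace('\t', '')
        if text and not self.skip_depth:
            self.lines.append(text)


# Made-up sample markup standing in for the contents of file.html.
sample = """
<div>
  <span class="pno">12</span>
  <h2 class="subhead">Section</h2>
  <p>First paragraph</p>
  <p>Second paragraph</p>
</div>
"""

parser = TextWithoutClasses(['pno', 'subhead'])
parser.feed(sample)
parser.close()
for line in parser.lines:
    print(line)
```

Unlike the Beautiful Soup version, this sketch does not repair missing closing tags, so the depth counter can drift on badly broken markup.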