Advertisement
Guest User

Untitled

a guest
Aug 1st, 2010
224
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
Python 0.35 KB | None | 0 0
  1. import urllib
  2. import lxml.html
  3.  
  4. def get_pages(url):
  5.     page = lxml.html.parse(urllib.urlopen(url))
  6.     mp3 = page.xpath("//a[contains(@href, '.mp3')]")
  7.     for item in mp3: print item.get('href')
  8.     nextpage = page.xpath("//span[text()='Next']/..")
  9.     if nextpage: get_pages(nextpage[0].get('href'))
  10.  
  11. get_pages("http://thehistoryofrome.typepad.com/")
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement