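import requests
import lxml.html
import cssselect  # not used directly; lxml's .cssselect() needs this package installed

# Fetch the TDCJ unit directory page and parse the HTML.
req = requests.get('http://tdcj.state.tx.us/unit_directory/')
root = lxml.html.fromstring(req.text)

# Take the first table on the page and skip its header row.
tables = root.cssselect('table')
table = tables[0]
rows = table.cssselect('tr')
rows = rows[1:]

for row in rows:
    cells = row.cssselect('td')
    print(cells[0].text_content())
    # A <td> has no href attribute, so look for an <a> inside the cell instead.
    links = cells[0].cssselect('a')
    print(links[0].get('href') if links else None)
    input()  # pause until Enter is pressed (raw_input() in Python 2)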