febrezo

Python webcrawler using TOR

Nov 26th, 2012
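## Credit: http://stackoverflow.com/questions/1096379/tor-with-python
## Ported to Python 3: urllib2 is now urllib.request and print is a function.
import urllib.request

url = "http://www.felixbrezo.com"

## Proxy handling: send HTTP requests to the local HTTP proxy on port 8118
## (Privoxy's default port), which is expected to forward them to Tor.
proxy = urllib.request.ProxyHandler({'http': '127.0.0.1:8118'})
opener = urllib.request.build_opener(proxy)
urllib.request.install_opener(opener)

## Strip a trailing slash so the last URL segment is never an empty file name.
file_name = url.rstrip('/').split('/')[-1]
u = urllib.request.urlopen(url)
meta = u.info()
## Content-Length is optional, so a missing header means "size unknown".
content_length = meta.get("Content-Length")
file_size = int(content_length) if content_length else None
print("Downloading: %s Bytes: %s" % (file_name, file_size))

file_size_dl = 0
block_sz = 8192
with open(file_name, 'wb') as f:
    while True:
        chunk = u.read(block_sz)
        if not chunk:
            break

        file_size_dl += len(chunk)
        f.write(chunk)
        if file_size:
            status = "%10d  [%3.2f%%]" % (file_size_dl, file_size_dl * 100. / file_size)
        else:
            status = "%10d" % file_size_dl
        ## Backspaces move the cursor back so the next status overwrites this one.
        status = status + chr(8) * (len(status) + 1)
        print(status, end=' ', flush=True)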
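
The script above installs its proxy opener globally and only proxies plain HTTP, so any HTTPS request would go out directly. Below is a minimal Python 3 sketch of the same download as reusable functions, assuming the proxy on 127.0.0.1:8118 also accepts HTTPS traffic. The names PROXY_ADDRESS, make_proxy_opener, copy_stream and download are hypothetical helpers, not part of the original paste.

```python
import io
import urllib.request

# Assumption: an HTTP proxy that forwards to Tor listens here, as in the script above.
PROXY_ADDRESS = "127.0.0.1:8118"


def make_proxy_opener(proxy_address=PROXY_ADDRESS):
    """Return an opener that routes http and https through the proxy.

    Unlike install_opener(), this leaves the process-wide default untouched.
    """
    handler = urllib.request.ProxyHandler(
        {"http": proxy_address, "https": proxy_address}
    )
    return urllib.request.build_opener(handler)


def copy_stream(source, destination, block_size=8192):
    """Copy source to destination in blocks; return the number of bytes copied."""
    copied = 0
    while True:
        chunk = source.read(block_size)
        if not chunk:
            break
        destination.write(chunk)
        copied += len(chunk)
    return copied


def download(url, file_name, opener=None):
    """Fetch url through the proxy opener and save it as file_name."""
    opener = opener or make_proxy_opener()
    with opener.open(url) as response, open(file_name, "wb") as out:
        return copy_stream(response, out)


# Offline check of the copy loop, using in-memory buffers instead of a response.
demo_out = io.BytesIO()
copied = copy_stream(io.BytesIO(b"x" * 20000), demo_out)
```

With the proxy running, a call such as download("http://www.felixbrezo.com", "index.html") would save the page to disk. Keeping the copy loop separate from the network call means it can be exercised without Tor or a network connection.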