wtgeographer

web2csv

Oct 19th, 2016
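import os

import pandas as pd
import requests
from bs4 import BeautifulSoup

# Work from the desktop so the CSV is written there.
os.chdir(r'C:\Users\<user>\Desktop')

# Download the AKC registration statistics page.
url = "https://www.akc.org/reg/dogreg_stats.cfm"
r = requests.get(url)
r.raise_for_status()  # stop early on an HTTP error instead of parsing an error page
html = r.text

# Name the parser explicitly so the result does not depend on which parsers are installed.
soup = BeautifulSoup(html, 'html.parser')

# The first table on the page holds the rankings; skip its two header rows.
table = soup.find_all('table')[0]
rows = table.find_all('tr')[2:]

# One list per output column.
data = {
    'breeds': [],
    'rank2015': [],
    'rank2014': [],
    'rank2013': []
}

for row in rows:
    cols = row.find_all('td')
    if len(cols) < 4:
        continue  # skip rows that do not have all four cells
    # get_text() already returns text; encoding it to bytes would write b'...' into the CSV on Python 3.
    data['breeds'].append(cols[0].get_text().strip())
    data['rank2015'].append(cols[1].get_text().strip())
    data['rank2014'].append(cols[2].get_text().strip())
    data['rank2013'].append(cols[3].get_text().strip())

# Fix the column order explicitly, then write the file as UTF-8.
dogData = pd.DataFrame(data, columns=['breeds', 'rank2015', 'rank2014', 'rank2013'])
dogData.to_csv("AKC_Dog_Registrations.csv", encoding='utf-8')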