Nov 4th, 2019
  1. "title","url","description"
  2. "NYC","https://opendata.cityofnewyork.us/","Datasets from NYC"
  3. "DC Open Data","http://opendata.dc.gov/","Open data from the District of Columbia"
  4. "Virginia","https://data.virginia.gov/","State of Virginia public datasets"
  5. "Open Gov","https://www.data.gov/open-gov/","Open Gov datasets"
  6. "AssetMacro","http://www.assetmacro.com/market-data","historical data of Macroeconomic Indicators and Market Data"
  7. "Awesome Public Datasets on GitHub","https://github.com/caesar0301/awesome-public-datasets","curated by caesar0301"
  8. "AWS Public Data Sets","http://aws.amazon.com/publicdatasets/","Provides a centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications."
  9. "BigML","http://blog.bigml.com/2013/02/28/data-data-data-thousands-of-public-data-sources/#comment-7538","A big list of public data sources"
  10. "Bioassay Data","http://www.jcheminf.com/content/pdf/1758-2946-1-21.pdf","Bioassay data, described in Virtual screening of bioassay data, by Amanda Schierz, J. of Cheminformatics"
  11. "Bitly 1.usa.gov","http://www.usa.gov/About/developer-resources/1usagov.shtml","Bitly 1.usa.gov data, anonymized clicks on gov links"
  12. "Canada Open Data","http://www.data.gc.ca/","Pilot project with many government and geospatial datasets"
  13. "Causality Workbench","http://www.causality.inf.ethz.ch/repository.php","Causality Workbench data repository"
  14. "Corral Big Data","http://www.tacc.utexas.edu/resources/data-storage/#corral","Corral Big Data repository at Texas Advanced Computing Center, supporting data-centric science"
  15. "CrowdFlower","http://www.crowdflower.com/data-for-everyone","Data for Everyone library"
  16. "Data Source Handbook","http://shop.oreilly.com/product/0636920018254.do","A Guide to Public Data, by Pete Warden, O'Reilly (Jan 2011)"
  17. "Datacatalogs.org","http://datacatalogs.org/","Open government data from US, EU, Canada, CKAN, and more"
  18. "Data.gov.uk","http://data.gov.uk/","Publicly available data from the UK (see also the London Datastore, http://data.london.gov.uk/)"
  19. "Data.gov/Education","http://www.data.gov/education","Central guide for education data resources including visualization tools, classroom resources, applications and more"
  20. "DataMarket","http://datamarket.com/","Visualize the world's economy, societies, nature, and industries"
  21. "Datamob","http://datamob.org/","Public data put to good use"
  22. "Data Planet","http://www.data-planet.com/","The largest repository of standardized and structured statistical data, with over 25 billion data points, 4.3 billion datasets, 400+ source databases"
  23. "Datasets.co","http://www.datasets.co/","Datasets for data geeks, find and share Machine Learning datasets"
  24. "DataSF.org","http://datasf.org/","A clearinghouse of datasets available from the City & County of San Francisco, CA"
  25. "DataFerrett","http://dataferrett.census.gov/","A data mining tool that accesses and manipulates TheDataWeb, a collection of many on-line US Government datasets"
  26. "Delve","http://www.cs.toronto.edu/~delve","Data for Evaluating Learning in Valid Experiments"
  27. "EconData","http://inforumweb.umd.edu/econdata/econdata.html","Thousands of economic time series, produced by a number of US Government agencies"
  28. "data.world","https://data.world/","Discover and share cool data, connect with interesting people, and work together to solve problems faster"
  29. "Enron Email Dataset","http://www.cs.cmu.edu/~enron/","Data from about 150 users, mostly senior management of Enron"
  30. "Europeana Data","http://data.europeana.eu/","Contains open metadata on 20 million texts, images, videos and sounds gathered by Europeana - the trusted and comprehensive resource for European cultural heritage content"
  31. "FEDSTATS","http://www.fedstats.gov/","A comprehensive source of US statistics and more"
  32. "FIMI","http://fimi.cs.helsinki.fi/","Repository for frequent itemset mining, implementations and datasets"
  33. "Financial Data Finder","http://fisher.osu.edu/fin/fdf/osudata.htm","The Financial Data Finder at OSU is a large catalog of financial data sets"
  34. "GDELT","http://www.guardian.co.uk/news/datablog/2013/apr/12/gdelt-global-database-events-location","GDELT: The Global Data on Events, Location and Tone"
  35. "GEO Gene Expression Omnibus","http://www.ncbi.nlm.nih.gov/geo/","A gene expression/molecular abundance repository supporting MIAME compliant data submissions"
  36. "GeoDa Center","http://geodacenter.asu.edu/datalist/","Geographical and spatial data"
  37. "Google ngrams datasets","http://ngrams.googlelabs.com/datasets","Text from millions of books scanned by Google"
  38. "Grain Market Research","http://www.grainmarketresearch.com/","Financial data including stocks, futures, etc"
  39. "Hilary Mason Big Data Sets","https://bitly.com/bundles/hmason/1","Hilary Mason research-quality Big Data sets collection including many text and image datasets"
  40. "HitCompanies Datasets","http://endb-consolidated.aihit.com/datasets.htm","Comprehensive data on 10,000 UK companies randomly sampled from HitCompanies, updated automatically using AI/Machine Learning"
  41. "ICWSM-2009","http://www.icwsm.org/2009/data/","The ICWSM-2009 dataset contains 44 million blog posts made between August 1st and October 1st, 2008"
  42. "Infochimps","http://infochimps.org/","An open catalog and marketplace for data. You can share, sell, curate, and download data about anything and everything"
  43. "Investor Links","http://www.investorlinks.com/","Includes financial data"
  44. "Kaggle Datasets","https://www.kaggle.com/datasets",""
  45. "KDD Cup center","http://www.sigkdd.org/kddcup/index.php","KDD Cup center, with all data, tasks, and results"
  46. "Kevin Chai list of datasets","http://kevinchai.net/datasets/","For text, SNA, and other fields"
  47. "KONECT: the Koblenz Network Collection","http://konect.uni-koblenz.de/","Large network datasets of all types in order to perform research in the area of network mining"
  48. "Linking Open Data project","http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData","Making data freely available to everyone"
  49. "Million Song Dataset","http://labrosa.ee.columbia.edu/millionsong/",""
  50. "MIT Cancer Genomics","http://www-genome.wi.mit.edu/cgi-bin/cancer/datasets.cgi","MIT Cancer Genomics gene expression datasets and publications, from MIT Whitehead Center for Genome Research"
  51. "ML Data","http://mldata.org/","The data repository of the EU Pascal2 networks"
  52. "NASDAQ Data Store","https://data.nasdaq.com/","Provides access to market data"
  53. "National Government Statistical Web Sites","http://www.archive-it.org/","Data, reports, statistical yearbooks, press releases, and more from about 70 web sites, including countries from Africa, Europe, Asia, and Latin America"
  54. "National Space Science Data Center (NSSDC)","http://nssdc.gsfc.nasa.gov/","NASA data sets from planetary exploration, space and solar physics, life sciences, astrophysics, and more"
  55. "NetworkRepository","http://www.networkrepository.com/","Interactive Data Repository, has many collections of graph and networks from social science, machine learning, scientific computing, and other areas"
  56. "Open Data Census","http://census.okfn.org/","Assessments of the state of open data around the world"
  57. "Socrata OpenData","http://opendata.socrata.com/","OpenData from Socrata, access to over 10,000 datasets including business, education, government, and fun."
  58. "Open Source Sports","http://www.opensourcesports.com/","Open Source Sports, many sports databases, including Baseball, Football, Basketball, and Hockey."
  59. "Peter Skomoroch dataset Bookmarks","http://www.delicious.com/pskomoroch/dataset",""
  60. "PubGene(TM)","http://www.pubgene.org/","Gene Database and Tools, genomic-related publications database"
  61. "Quandl","http://www.quandl.com/","A collaboratively curated portal to millions of financial and economic time-series datasets"
  62. "qunb","http://www.qunb.com/","A platform to find and visualize quantitative data."
  63. "Robert Shiller data","http://www.econ.yale.edu/~shiller/data.htm","Data on housing, stock market, and more from his book Irrational Exuberance"
  64. "Stanford Microarray Database","http://genome-www5.stanford.edu/MicroArray/SMD/","The Stanford Microarray Database stores raw and normalized data from microarray experiments"
  65. "Jerry Smith dataset collection","http://datascientistinsights.com/2013/02/02/data-monetization-road-paved-on-top-of-data-sets/","Finance, Government, Machine Learning, Science, and other data"
  66. "SourceForge.net Research Data","http://www.nd.edu/~oss/Data/data.html","Includes historic and status statistics on approximately 100,000 projects and over 1 million registered users' activities at the project management web site."
  67. "StatLib","http://lib.stat.cmu.edu/datasets/","StatLib, CMU Datasets Archive"
  68. "STATOO Datasets (part 1)","http://www.statoo.com/en/resources/anthill/Datamining/Data/",""
  69. "STATOO Datasets (part 2)","http://www.statoo.com/en/resources/anthill/Data_Sets/",""
  70. "Time Series Data Library","http://robjhyndman.com/TSDL/",""
  71. "Visual Analytics Benchmark Repository","http://hcil.cs.umd.edu/localphp/hcil/vast/archive/viewbm.php",""
  72. "UCI KDD Database Repository","http://kdd.ics.uci.edu/","Large datasets used in machine learning and knowledge discovery research"
  73. "UCI Machine Learning Repository","http://archive.ics.uci.edu/ml/",""
  74. "UCR Time Series Data Archive","http://www.cs.ucr.edu/~eamonn/time_series_data/","datasets, papers, links, and code"
  75. "UK Open Postcode Geo","https://www.getthedata.com/open-postcode-geo","UK Open Postcode Geo, UK/British postcodes with easting, northing, latitude, and longitude"
  76. "United States Census Bureau","http://www.census.gov/",""
  77. "Web Data Commons","http://webdatacommons.org/","Structured data from the Common Crawl, the largest public web corpus"
  78. "Webhose free datasets","https://webhose.io/datasets",""
  79. "Wikiposit","http://wikiposit.org/","A (virtual) amalgamation of (mostly financial) data from many different sites, allowing users to merge data from different sources"
  80. "Wolfram Alpha Medical","http://blog.wolframalpha.com/2010/06/29/disease-and-patient-level-statistics-with-wolframalpha/","Wolfram Alpha's disease and patient level data"
  81. "Yahoo Sandbox datasets","http://webscope.sandbox.yahoo.com/catalog.php","Yahoo Sandbox datasets, Language, Graph, Ratings, Advertising and Marketing, Competition"
  82. "Yelp Academic Dataset","http://www.yelp.com/academic_dataset","All the data and reviews of the 250 closest businesses for 30 universities for students and academics to explore and research"
