- # Here is a picture of all the tables that are created through the 3 scripts: http://snag.gy/cySx0.jpg
- require 'nokogiri'
- require 'open-uri'
- require 'cgi'
- require 'uri'
- require 'sequel'
- #This script will go to the link below, and extract all of the Genre names and Genre IDs.
- #The output will be in the form of two arrays, one for names and one for IDs (parsed from the links).
- clubland = "http://www.clublandlv.com/forum.php"
- doc = Nokogiri::HTML(URI.open(clubland))
- doc.css("#c_cat10").each do |grab|
- genres = grab.css(".forumtitle").map(&:text)
- genre_links = grab.css(".forumtitle")
- genre_links = genre_links.map do |link|
- link_id = link.children.first["href"]
- CGI.parse(URI.parse(link_id).query)['f'].first.to_i
- end
- #This part of the script will import the results to the database, in the specified columns.
- DB = Sequel.connect('sqlite:///Users/RyanOConnor/workspace/testing/clublandlv.sqlite')
- genre_db = DB[:genres]
- genre_db.import([:genre, :genre_id], genres.zip(genre_links))
- end
- #The same script is tweaked to pull the Subgenre data. Right now it is hardcoded to go to one of the Genre pages that contain the Subgenre data, but I want to add a part that grabs each genre_id from the genres table and runs the script once per genre.
- #For example, genre links look like "http://www.clublandlv.com/forumdisplay.php?f=46", where ?f={genre_id}.
- # Looking at the picture, all of the entries in the Subgenres table are children of "Dirty House Music", genre_id 46.
- # Whichever genre_id is currently being processed, its genre name should be added to the matching column of the Subgenres table.
- #The songs table will have similar functionality, with tables linking to each other ON certain conditions.
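
The per-genre loop described in the comments above could be sketched roughly as follows. This is only a sketch under assumptions: the helper names `genre_page_url`, `genre_id_from_link`, and `genre_pages` are hypothetical, the genre rows are a hand-written array of hashes standing in for what `DB[:genres].all` would return, and `URI.decode_www_form` is swapped in for `CGI.parse` so the example needs only the `uri` standard library. No pages are fetched here.

```ruby
require 'uri'

# Build the page URL for one genre, following the ?f={genre_id} pattern
# from the example link in the comments. The default base URL is taken
# from that example; the method name itself is hypothetical.
def genre_page_url(genre_id, base = 'http://www.clublandlv.com/forumdisplay.php')
  "#{base}?f=#{Integer(genre_id)}"
end

# Reverse direction: pull the numeric f parameter back out of a link.
# Returns nil when the link has no query string or no f parameter.
def genre_id_from_link(href)
  query = URI.parse(href).query
  return nil if query.nil?

  value = URI.decode_www_form(query).to_h['f']
  value && value.to_i
end

# Hypothetical driver: for each genre row, work out which page the
# subgenre scraper would need to visit, and keep the genre name next to
# it so it can be written into the Subgenres table later.
def genre_pages(genre_rows)
  genre_rows.map do |row|
    { genre: row[:genre], genre_id: row[:genre_id], url: genre_page_url(row[:genre_id]) }
  end
end

# Stand-in for the rows of the genres table (assumed shape: one hash per row).
rows = [{ genre: 'Dirty House Music', genre_id: 46 }]
pages = genre_pages(rows)
pages.each { |page| puts "#{page[:genre]} (#{page[:genre_id]}): #{page[:url]}" }
```

In the real script, each `page[:url]` would presumably be passed to the existing Nokogiri scraping code in place of the hardcoded Genre page, with `page[:genre]` stored alongside every subgenre row it produces.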