- import os, sys
- import urllib, re, cgi
- import HTMLParser, unicodedata
- for URL in open('list.txt','r').readlines():
-     # print("URL =", URL)
-     ## Parse playlist title
-     #
-     these_regex="<title>(.+?)</title>"
-     pattern=re.compile(these_regex)
-     htmlfile=urllib.urlopen(URL)
-     htmltext=htmlfile.read()
-     title=re.findall(pattern,htmltext)
-     title=HTMLParser.HTMLParser().unescape(title)[0]
-     print "Title = ", title.decode()
- ----- RESULT --------
- Title = Wizo - Anderster Full Album - YouTube
- Title = Wizo - Bleib Tapfer / für'n Arsch Full Album - YouTube
- Title = WIZO - Uuaarrgh Full Album - YouTube
- Title = WIZO - Full Album - "Punk gibt's nicht umsonst! (Teill III)" - YouTube
- Title = WIZO - Full Album - "DER" - YouTube
- Title = Alarmsignal - Wir leben - YouTube
- Title = the Pogues - Body of an american - YouTube
- Title = The Pogues - The band played waltzing matilda - YouTube
- Title = Hey Rote Zora - Heiter bis Wolkig - YouTube
- Title = Für immer Punk - die goldenen Zitronen - YouTube
- Title = Fuckin' Faces - Krieg und Frieden - YouTube
- Title = Sluts - Anders - YouTube
- Title = Absturz - Es ist schön ein Punk zu sein - YouTube
- Title = Broilers - Ruby Light & Dark - YouTube
- Title = Less Than Jake 02 - My Very Own Flag - YouTube
- Title = The Mighty Mighty Bosstones - The Impression That I Get - YouTube
- Title = Streetlight Manifesto - Failing Flailing (lyrics) - YouTube
- Title = Mustard Plug - Mr. Smiley - YouTube
- But when I try:
- os.mkdir(title)
- I get the following:
- UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 23: ordinal not in range(128)
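One possible way around the error, as a rough sketch and not a confirmed fix: byte 0xc3 at position 23 matches the UTF-8 encoding of the "ü" in "für'n", so the title is probably a UTF-8 byte string that gets decoded with the default ASCII codec somewhere along the way. The sketch below assumes the page really is UTF-8 and decodes it explicitly before building the directory name. It also replaces "/", which appears in that same title and cannot be part of a single directory name. The helper name `safe_dirname` and its `encoding` parameter are made up for illustration.

```python
import os
import tempfile


def safe_dirname(raw_title, encoding="utf-8"):
    """Return a text directory name built from a page title.

    Hypothetical helper: the name and the UTF-8 default are assumptions.
    """
    # Decode explicitly instead of relying on the implicit ASCII default.
    if isinstance(raw_title, bytes):
        raw_title = raw_title.decode(encoding)
    # "/" separates path components, so it cannot appear inside one name.
    return raw_title.replace("/", "-").strip()


if __name__ == "__main__":
    # Sample title from the output above, encoded the way it would arrive
    # from a UTF-8 page.
    raw = u"Wizo - Bleib Tapfer / f\u00fcr'n Arsch Full Album - YouTube".encode("utf-8")
    base = tempfile.mkdtemp()  # scratch directory so the demo leaves the cwd alone
    target = os.path.join(base, safe_dirname(raw))
    os.mkdir(target)
    print(os.path.isdir(target))
```

In the loop above this would be used as `os.mkdir(safe_dirname(title))`. Whether the directory can actually be created still depends on the filesystem encoding of the machine running the script.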