BinYamin

fetching imdb ratings and summary

Aug 21st, 2016
'''
    Author: Sayyad Shaha Hassan
    Date:   21 August 2016
if (you do not like verbose):
    Hey there,
    this program fetches the ratings, plot, genres and other details of the movies on your disk,
    so you can decide which one to watch first and which ones to show to your little cousins.
    It also lists movies by genre, so if you would like to watch all the sci-fi movies first you can check that list and enjoy.
elif (you enjoy reading):
    With the rise of high-capacity hard disks, we all love to keep as many movies and serials as we can.
    And this gives rise to a new problem. We sometimes end up with so many unseen movies on our hard disk that we can't decide which movie to start with.
    I was gifted a hard disk by my beloved coworkers and I immediately filled it up with everything I could.
    Then, when it came to watching movies, I was so overwhelmed by the number of them that I could not decide which ones were worth watching and, of those, where I should begin.

    Then I realised I could use IMDb to fetch the genres, ratings and plots of the movies, and I immediately wrote an automation to do exactly that.
    Here I present to you a Python script* that makes your work easier. You simply run it on a folder which contains your movies.
    An internet connection is needed. Not much data will be consumed. Even a 2G connection is enough: a few KB for each folder.
    You only need to keep one movie per folder, and the folder should be named after the movie. Now just run the script on the folder.
    Voila!!
    Each movie folder will now be renamed as "Movie name (ReleaseYear) [Rating**,imdbRating,RottenTomatoRating]"
    **Rating as in R, PG etc.
    Along with this, each folder will have a file that contains the names of the actors, the duration of the movie and the plot of the movie (from IMDb).

    ----
    *Script? But I thought this page is about C programming, you say?
    Well, yes. But between my job, lazing around and catching and training Pokemon, I don't get much time. Pokemon? o.O Ahh! It's so I can walk and stay healthy, plus the Pokemon :P.
    Also, scripting is really cool. It helps you write so much in so little time. So it's only good if you can write code in more than one language.
'''
#=================================================================

# Python code begins:
# Written for Python 2 (print statements, urllib.quote, urllib.FancyURLopener).
import os, sys, glob, re, json, urllib
from collections import defaultdict

# Raw string, so the backslashes in the example paths are printed literally
# (in a normal string the "\f" in "\file.mp4" would become a form feed).
longmsg = r'''
syntax:
    MovieRatings.py RootFolder [-u|-s]
    MovieRatings.py RootFolder -m CustomName [Year]

    -m manual override, in case you want to supply the movie's name (and optionally its year) yourself.
        E.g. MovieRatings.py D:\mymovies\someName -m "Different movie name than the folder's"

    -u undoes everything. (Deletes the summary files and removes the ratings from the folder names.)
    -s summarize. Only makes the lists of movies by genre.

    The movies should be kept as shown below.
    One movie per folder.
    The folder that holds the movie should be named
            >> [SNo] Movie Name [(Year)]
    SNo:        In case of sequels/prequels this can be a digit or two.
    Movie Name: Match the movie title as closely as possible. No spelling mistakes.
    Year:       Optional. Should be enclosed in round brackets.
    3 Star Wars III
    1 Madagascar (2005) This will work too.
    Madagascar (2005)   This is the best.
    Madagascar          This is OK.

    movies
    movies\Madagascar series
    movies\Madagascar series\madagascar (2005)\file.mp4
    movies\Madagascar series\madagascar 2 \file.avi
    movies\Frozen\file.avi

    Output:
    The script renames the folders so they show Name (Year) [Rating, imdbRating, tomatoRating]
    and drops a summary file in each folder, which contains the plot and other details of the movie.


    ProTips:
    Use the full name only: 'Part II' != 'Part 2'.
    Do not skip words in between: when searching for 'the hangover part II', 'hangover part' is still OK, 'the hangover II' is not.


    Author: Sayyad Shaha Hassan
    Date: 21/Aug/2016
'''

# Summary files are named "sh Title (Year) [Rated,imdb,tomato].txt".
strFileMatch = r"sh .*\(.+\) \[.*\]\.txt"
reFileMatch = re.compile(strFileMatch)
# Splits a folder name around a four-digit year in round brackets.
strYear = r"(.*)(\()(\d{4})(\))(.*)"
reYear = re.compile(strYear)
dictReYearMatch = {"name":1, "bracket1":2, "year":3, "bracket2":4, "remName":5}
strRating = r"\[.{1,10},.{1,5},.{1,5}\]"

def fSummarize(dpRoot):
    # Build one "<Genre>.txt" list per genre in dpRoot from the summary files.
    for root, dirs, files in os.walk(dpRoot):
        for file in files:
            if reFileMatch.search(file):
                with open(os.path.join(root, file)) as fIn:
                    data = fIn.read()
                lines = data.split("\n")
                # The summary file starts with a blank line, so the title is
                # at index 1 and the "Genre: ..." entry is at index 3.
                sTitle = lines[1]
                sGenres = lines[3][lines[3].find(":") + 1:]
                # Strip the space around each genre so the list files are not
                # created with a leading space in their names.
                lGenres = [sGenre.strip() for sGenre in sGenres.split(",")]
                for sGenre in lGenres:
                    if not sGenre:
                        continue
                    tfpGenreList = os.path.join(dpRoot, sGenre) + ".txt"
                    try:
                        with open(tfpGenreList) as fIn:
                            movieList = fIn.read().split("\n")
                        if sTitle in movieList:
                            continue
                    except IOError:
                        # No list for this genre yet; it is created below.
                        pass
                    with open(tfpGenreList, "a") as fOut:
                        fOut.write(sTitle + "\n")
    return

def fMakeFileNRenFolder(jsonResponse, root, PDir):
    # Write the summary file into root, then rename root to include the ratings.
    if not jsonResponse:
        return None
    # formatting json response
    resTitle = fMakePathSafe(jsonResponse['Title'])
    resYear = fMakePathSafe(jsonResponse["Year"])
    resRated = fMakePathSafe(jsonResponse["Rated"])
    resimdbRate = fMakePathSafe(jsonResponse['imdbRating'])
    resTomatoUserRate = fMakePathSafe(jsonResponse['tomatoUserRating'])
    resGenre  = fMakePathSafe(jsonResponse['Genre'])
    resActors = fMakePathSafe(jsonResponse['Actors'])
    resRuntime = fMakePathSafe(jsonResponse['Runtime'])

    # Append the release year only if the folder name does not already have one.
    tYear = ""
    if not reYear.search(PDir):
        tYear = ' (' + resYear + ')'

    dnNewPDirName = PDir + tYear + " [" + resRated + "," + resimdbRate + "," + resTomatoUserRate + "]"
    # Turn a drive-relative path such as "D:movies" into "D:\movies";
    # paths without a drive letter are left alone.
    newRoot = root
    if root[1:2] == ":" and root[2:3] != "\\":
        newRoot = root[:2] + "\\" + root[2:]

    fnSumFile = root + "\\sh " + resTitle + " (" + resYear + ") [" + resRated + "," + resimdbRate + "," + resTomatoUserRate + "].txt"
    with open(fnSumFile, "w") as fOut:
        fOut.write("\n" + resTitle + " (" + resYear + ")")
        fOut.write("\nRated: " + resRated)
        fOut.write("\nGenre: " + resGenre)
        fOut.write("\nActors: " + resActors)
        fOut.write("\nRuntime: " + resRuntime)

        fOut.write("\n\n Ratings: ")
        fOut.write("\nMetascore:        " + jsonResponse['Metascore'])
        fOut.write("\nimdbRating:       " + jsonResponse['imdbRating'])
        fOut.write("\ntomatoRating:     " + jsonResponse['tomatoRating'])
        fOut.write("\ntomatoUserRating: " + jsonResponse['tomatoUserRating'])

        fOut.write("\n\nPlot Summary(Full) \n" + jsonResponse['Plot'].encode('ascii','ignore'))
        fOut.write("\n\nSayyad")

    # os.system() never raises when "ren" fails, so rename through os.rename(),
    # which reports failures as OSError.
    try:
        os.rename(newRoot, os.path.join(os.path.dirname(newRoot), dnNewPDirName))
    except OSError:
        print "could not rename(" + newRoot + "," + dnNewPDirName + ')'
    return

def fUnDoStuff(dpRoot):
    # Delete the summary files and strip the "[ratings]" suffix from folder names.
    count = 0
    for root, dirs, files in os.walk(dpRoot):
        for file in files:
            if reFileMatch.search(file):
                os.remove(os.path.join(root, file))
                count += 1

                dpPDir = root
                dnPDir = root[root.rfind("\\") + 1:]
                locBrack = dnPDir.rfind("[")
                if locBrack == -1:
                    continue
                # Drop the space that preceded the "[" as well.
                dnNewPDir = dnPDir[:locBrack].rstrip()
                # os.system() never raises when "ren" fails; os.rename() does.
                try:
                    os.rename(dpPDir, os.path.join(os.path.dirname(dpPDir), dnNewPDir))
                except OSError:
                    print "could not rename(" + dpPDir + ", " + dnNewPDir + '). Sorry! '
    print str(count) + " files deleted."
    return

def fMakePathSafe(token):
    # Drop non-ASCII characters and characters that are not allowed in Windows file names.
    chars = '/\\:?"*|<>'
    token = token.encode('ascii','ignore')
    for char in chars:
        token = token.replace(char,'')
    return token

def fTryUrl(sTitle, iYear):
    # Query OMDb for a title (and optional year); return the parsed JSON or None.
    if sTitle == "":
        return None

    print sTitle + " :" + iYear
    urlStubYear = ""
    if iYear:
        urlStubYear = "&y="

    # fGetResponse() passes titles with "+" between the words. Keep "+" unescaped,
    # otherwise urllib.quote() sends it as a literal plus sign (%2B).
    urlFull = "http://www.omdbapi.com/?t={}{}{}&type={}&plot={}&tomatoes={}&r={}".format(urllib.quote(sTitle, safe="+"), urlStubYear, iYear, "movie", "full", "true", "json")
    opener = urllib.FancyURLopener({})
    try:
        handle = opener.open(urlFull)
    except IOError:
        return None
    jsonData = json.load(handle)
    if jsonData['Response'] == 'False':
        return None
    return jsonData

def fGetResponse(PDir):
    # Derive a title and year from the folder name and look them up.
    match = reYear.search(PDir)
    iYear = ""
    sTitle = PDir

    if match:
        iYear = match.group(dictReYearMatch["year"])
        if int(iYear) < 1900 or int(iYear) > 2050:
            # Not a plausible release year; treat the whole name as the title.
            iYear = ""
        else:
            sTitle = match.group(dictReYearMatch["name"]) + match.group(dictReYearMatch["remName"])

    sTitle = sTitle.strip()
    sTitle = sTitle.replace(" ", "+")
    if sTitle == "":
        return None

    # In case of a movie series where the name is "2 Star Wars" or "2. Star Wars",
    # the leading serial number is dropped to form an alternative title.
    firstWord = sTitle[:sTitle.find("+")]
    if not firstWord.isdigit():
        firstWord = sTitle[:sTitle.find(".")]
        if not firstWord.isdigit():
            firstWord = ""

    sAltTitle = ""
    if firstWord:
        sAltTitle = sTitle[len(firstWord)+1:].lstrip("+")

    # Try the most specific query first, then fall back to looser ones.
    jsonData = None
    for sTry, iTry in ((sTitle, iYear), (sAltTitle, iYear), (sTitle, ""), (sAltTitle, "")):
        jsonData = fTryUrl(sTry, iTry)
        if jsonData:
            break
    else:
        print "failed: not found"
    return jsonData

def fHelp(minArgCount, maxArgCount):
    # Print the usage text and exit unless len(sys.argv) is within the given bounds.
    msgHelp = """
    Invalid number of arguments.
    """
    if not minArgCount:
        minArgCount = 1
    if not maxArgCount:
        maxArgCount = minArgCount

    count = len(sys.argv)
    if count < minArgCount or count > maxArgCount:
        print msgHelp
        print longmsg
        exit()
    return
fHelp(2, 5)

dnRoot = sys.argv[1]
if "-u" in sys.argv:
    # undo
    fUnDoStuff(dnRoot)
    exit()
elif "-m" in sys.argv:
    # manual title override
    # PathPDir -m manualTitle [manualYear]
    argC = sys.argv.index("-m")

    strManualTitle = sys.argv[argC + 1]
    strManualYear = ""
    try:
        strManualYear = sys.argv[argC + 2]
    except IndexError:
        pass
    jsonResponse = fTryUrl(strManualTitle, strManualYear)
    PDir = dnRoot[dnRoot.rfind("\\") + 1:]
    fMakeFileNRenFolder(jsonResponse, dnRoot, PDir)
    exit()
elif "-s" in sys.argv:
    # summarize the movies
    fSummarize(dnRoot)
    exit()
else:
    # normal execution
    for root, dirs, files in os.walk(dnRoot):
        if not files:
            continue
        # Skip folders that already contain a summary file.
        if any(file.startswith("sh ") for file in files):
            continue

        # Drop the drive prefix ("D:" or "D:\") before taking the last path component.
        tRoot = root
        if ":" in tRoot:
            si = 2
            if tRoot[2:3] == "\\":
                si += 1
        else:
            si = 0
        tRoot = tRoot[si:]
        PDir = tRoot[tRoot.rfind("\\") + 1:]
        print "working on... ",
        print PDir

        jsonResponse = fGetResponse(PDir)
        if not jsonResponse:
            continue

        fMakeFileNRenFolder(jsonResponse, root, PDir)
    fSummarize(dnRoot)
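
The lookup step of the script above (split a folder name into title and year, then build the OMDb query) can be tried on its own without touching the disk or the network. The following is only a minimal sketch of that step, not the script's actual code: the names YEAR_RE, split_folder_name and build_query_url are made up for this illustration, and the import is written to run on both Python 2 and Python 3. The URL parameters are the ones the script already uses; the service may require additional parameters (such as an API key) today, which this sketch does not handle.

```python
import re

try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2

# Same idea as the script's year pattern: text, "(YYYY)", remaining text.
YEAR_RE = re.compile(r"(.*)\((\d{4})\)(.*)")


def split_folder_name(folder_name):
    """Return (title, year) parsed from a name like 'Madagascar (2005)'.

    The year is returned as a string and is empty when the name has no
    plausible year in round brackets (hypothetical helper for illustration).
    """
    match = YEAR_RE.search(folder_name)
    if match and 1900 <= int(match.group(2)) <= 2050:
        title = match.group(1) + match.group(3)
        return title.strip(), match.group(2)
    return folder_name.strip(), ""


def build_query_url(title, year=""):
    """Build the title query; the year parameter is added only when known."""
    url = "http://www.omdbapi.com/?t=" + quote(title)
    if year:
        url += "&y=" + year
    return url + "&type=movie&plot=full&tomatoes=true&r=json"


if __name__ == "__main__":
    for name in ("Madagascar (2005)", "Madagascar", "The Hangover Part II (2011)"):
        title, year = split_folder_name(name)
        print(build_query_url(title, year))
```

Keeping the parsing and the URL building as pure functions makes it easy to check how a folder name will be interpreted before letting the script rename anything.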