a guest
Oct 6th, 2016
**------------------------------------------------------------------------------------------------
* @header_start
* WebGrab+Plus ini for grabbing EPG data from TvGuide websites
* @Site: tv.com
* @MinSWversion: 1.1.1/56.12
* @Revision 4 - [24/07/2016] Blackbear199
* - added retry=
* @Revision 3 - [03/06/2016] Blackbear199
* - fix - sometimes title not present on showdetails page
* @Revision 2 - [12/01/2016] Francis De Paemeleere
* - get all the data available (previously only 3 or 4 days were grabbed)
* @Revision 1 - [05/01/2016] Jan van Straaten
* - remove some special chars in the title (only seen on movies)
* @Revision 0 - [03/11/2015] Jan van Straaten
* - creation
* @Remarks: directv alternative, less details
* @header_end
**------------------------------------------------------------------------------------------------

site {url=tv.com|timezone=UTC|maxdays=10|cultureinfo=en-US|charset=UTF-8|titlematchfactor=90|nopageoverlaps}
site {loadcookie=tv.com.cookies.txt|retry=<retry channel-delay="5" index-delay="1" time-out="5">4</retry>}
urldate.format {datestring|yyyy-MM-dd}
*url_index{url|http://www.tv.com/listings/station/|channel|}
url_index{url|http://www.tv.com/listings/singlestation/?start=##TIMESTAMP##&station=|channel|}

scope.range {(urlindex)|end}
index_variable_element.modify {calculate(format=date,unix)|'urldate'}
url_index.modify {replace|##TIMESTAMP##|'index_variable_element'}
end_scope
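* Worked example of the (urlindex) scope above. This is only a sketch: it assumes 'urldate'
* resolves to midnight UTC (in line with timezone=UTC), and <channel> is a placeholder for
* the value that url_index appends for the channel being grabbed.
*   urldate 2016-10-06  ->  unix timestamp 1475712000
*   http://www.tv.com/listings/singlestation/?start=1475712000&station=<channel>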

url_index.headers {customheader=Accept-Encoding=gzip,deflate} * to speed up the downloading of the index pages

scope.range {(splitindex)|end}
index_showsplit.scrub {single||||}
index_showsplit.modify {cleanup(style=jsondecode)}
index_showsplit.modify {substring(type=regex)|<li class="event row".+?</li>}
index_showsplit.modify {cleanup(removeduplicates=equal,100)}
end_scope

index_start.scrub {regex||data-start="(\d{10})"||}
index_title.scrub {regex||<div class="title">(.+?)</div>||}
index_title.modify {remove(type=regex)|"(<label>.+?</label>)"}
index_title.modify {remove(type=regex)|"(<[^>]*>)"}
index_description.scrub {regex||<div class="desc">(.*?)</div>||}

index_temp_1.scrub {regex||data-tmsid="rvp:(\d+?)"||} * id
index_urlshow.modify {set('index_temp_1' not "")|http://www.tv.com/listings/event/?EventTmsId=rvp%3A'index_temp_1'}
*http://www.tv.com/listings/event/?EventTmsId=rvp%3A1952005171

index_urlshow.headers {customheader=Accept-Encoding=gzip,deflate} * to speed up the downloading of the detail pages

title.scrub {regex||<h1>(?:<a href=.+?>)?(.+?)(?:</a>)?</h1>||}
subtitle.scrub {regex||<h2>(?:<a href=.+?>)?(.+?)(?:</a>)?</h2>||}
description.scrub {regex||<div class=\\"description\\">(.*?)</div>||}
description.modify {remove|\}
showicon.scrub {regex||data-image=\\"(http://.+?\.jpg)\\"||}

title.modify {remove(type=regex)|"(<label>.+?</label>)"}
title.modify {remove(type=regex)|"(<[^>]*>)"}
title.modify {addstart('title' "")|'index_title'}
subtitle.modify {remove(type=regex)|"(<[^>]*>)"}
category.modify {substring(type=regex)|'title' "<label>(.+?):\s?</label>"}

** _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
** ##### CHANNEL FILE CREATION (only to create the xxx-channel.xml file)
**
** @auto_xml_channel_start
*scope.range {(channellist)|end}
*url_index {url|http://www.tv.com/listings/}
*index_site_channel.scrub {regex||<a href="/listings/station/\d+?".+?title="(.+?)"||}
*index_site_id.scrub {regex||<a href="/listings/station/(\d+?)"||}
*index_temp_6.scrub {regex||class="name">(.+?)(?:</a>\|</div>)||}
*** add channel name makes it more clear?
*index_temp_1.modify {set|0}
*loop {(each "index_temp_2" in 'index_site_channel')|end}
*index_temp_3.modify {substring(type=element)|'index_temp_6' 'index_temp_1' 1} * name
*index_temp_4.modify {addend|'index_temp_2' ('index_temp_3')####}
*index_temp_1.modify {calculate(format=F0)|1+}
*end_loop
*index_site_channel.modify {set|'index_temp_4'}
*index_site_channel.modify {replace|####|\|}
*index_site_channel.modify {cleanup(removeduplicates=equal,100 link="index_site_id")}
*end_scope
** @auto_xml_channel_end