SvetlozarDraganov

Untitled

Aug 29th, 2018
# Fragment from a Scrapy spider callback: threadSel, url and forumItem are defined elsewhere in the spider.
import scrapy
from scrapy.loader import ItemLoader
from scrapy.loader.processors import TakeFirst  # newer Scrapy releases: from itemloaders.processors import TakeFirst

forum_loader = ItemLoader(item=forumItem(), selector=threadSel)
forum_loader.default_output_processor = TakeFirst()  # keep the first non-empty scraped value for each field
forum_loader.add_xpath('thread_name', ".//td[@class='cell-topic js-cell-topic']/div[@class='topic-wrapper js-topic-wrapper h-wordwrap']/a[@class='topic-title js-topic-title']/text()")
# 'url' and 'url_id' are loaded from the same href attribute.
forum_loader.add_xpath('url', ".//td[@class='cell-topic js-cell-topic']/div[@class='topic-wrapper js-topic-wrapper h-wordwrap']/a[@class='topic-title js-topic-title']/@href")
forum_loader.add_xpath('url_id', ".//td[@class='cell-topic js-cell-topic']/div[@class='topic-wrapper js-topic-wrapper h-wordwrap']/a[@class='topic-title js-topic-title']/@href")
forum_loader.add_xpath('responses', ".//td[@class='cell-count']/div[@class='posts-count']/text()")
forum_loader.add_xpath('dateStarted', ".//td[@class='cell-topic js-cell-topic']/div[@class='topic-info h-clear h-hide-on-small h-hide-on-narrow-column']/span[@class='date']/text()")
forum_loader.add_xpath('dateLastUpdated', ".//td[@class='cell-lastpost']/span[@class='post-date']/text()")
forum_loader.add_xpath('lastPostBy', ".//td[@class='cell-lastpost']/div[@class='lastpost-by']/a/text()")
forum_loader.add_xpath('forum_section', ".//td[@class='cell-topic js-cell-topic']/div[@class='topic-info h-clear h-hide-on-small h-hide-on-narrow-column']/span[@class='f-title']/a/text()")
# yield forum_loader.load_item()
# print('>>>%s' % url)

# Despite the key name, meta['itemLoader'] carries the loaded item (the result of load_item()), not the loader.
yield scrapy.Request(url, callback=self.parse_thread_details, meta={'itemLoader': forum_loader.load_item()})
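
The loader above relies on TakeFirst() as its default output processor. TakeFirst returns the first collected value that is neither None nor an empty string, which is slightly different from "the first element of the list". The sketch below is a plain-Python stand-in that mimics that behavior so it can be run without Scrapy. The function name take_first and the sample values are hypothetical and exist only for illustration; they are not part of the Scrapy API or of the original spider.

```python
from typing import Any, Iterable, Optional


def take_first(values: Iterable[Any]) -> Optional[Any]:
    """Stand-in for Scrapy's TakeFirst: first value that is not None or ''."""
    for value in values:
        if value is not None and value != '':
            return value
    return None  # no usable value was collected


# Hypothetical per-field value lists, shaped like what add_xpath() might collect.
scraped = {
    'thread_name': ['', 'Example thread', 'second match'],
    'responses': ['12'],
    'lastPostBy': [],
}

# One value per field, mirroring a loader whose default output processor is TakeFirst().
item = {field: take_first(values) for field, values in scraped.items()}

# Whitespace-only text is not an empty string, so it is returned unchanged.
whitespace_case = take_first(['\n  ', 'Real title'])
```

Because whitespace-only strings pass through, text() XPaths that pick up surrounding newlines may need stripping in an input processor before TakeFirst runs.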