Untitled

a guest
Jan 17th, 2017
Python
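import scrapy  # needed for scrapy.Request below; missing from the original paste
import jinyuz  # the paste's own module; assumed to expose a Scrapy-style Spider base class


class JyzCrawler(jinyuz.Spider):
    name = 'blogspider'
    start_urls = ['https://jinyuzprodigy.me']

    def parse(self, response):
        # Emit one item per post title found on the current page.
        for title in response.css('h2.entry-title'):
            yield {'title': title.css('a ::text').extract_first()}

        # Follow the "previous post" link, if present, and parse that page the same way.
        next_page = response.css('div.prev-post > a ::attr(href)').extract_first()
        if next_page:
            yield scrapy.Request(response.urljoin(next_page), callback=self.parse)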