Try95th

linkToSoup_scrapingAnt

Nov 14th, 2022 (edited)
## FIRST, register and get an API token from https://scrapingant.com/

## Facilitates simple requests+bs4 scraping for sites with blockers, but:
## ! the free tier allows only a limited number of requests per month
## ! it can be very slow (though the proxies can be useful)

## requests/cloudscraper/HTMLSession version[s] at https://pastebin.com/rBTr06vy
## [simplified] selenium version at https://pastebin.com/VLZ2vPYK
## sample usage: similar to first example at https://pastebin.com/E3sCEr9r


import requests
from bs4 import BeautifulSoup
import urllib.parse

def linkToSoup_scrapingAnt(url_to_Scrape, pCountry=None, setResid=False,
            apiKey=None, loadCss=None, fparser='html.parser', isv=True, returnErr=False):
    defaultKey = 'YOUR_API_TOKEN' # paste here
    sa_api = 'https://api.scrapingant.com/v2/general'
    sa_key = str(apiKey) if apiKey else defaultKey

    qParams = {'url': url_to_Scrape, 'x-api-key': sa_key}
    if setResid: qParams['proxy_type'] = 'residential' # more expensive
    if pCountry: qParams['proxy_country'] = pCountry # more expensive
    if loadCss: qParams['wait_for_selector'] = loadCss # CSS selector to wait for

    reqUrl = f'{sa_api}?{urllib.parse.urlencode(qParams)}'
    # note: the verbose print below includes the API key (it is part of reqUrl)
    if isv: print('fetching with ScrapingAnt:', url_to_Scrape, '\nwith ', reqUrl)
    r = requests.get(reqUrl)

    # an error response is expected to be JSON like {"detail": ...};
    # a successfully fetched page is normally HTML, so r.json() fails on it
    try: rJson = r.json()
    except ValueError: rJson = None

    if isinstance(rJson, dict) and [*rJson] == ['detail']:
        errMsg = f'{rJson["detail"]} [<response {r.status_code}> {r.reason}] {r.url}'
    elif r.status_code == 200:
        return BeautifulSoup(r.content, fparser)
    else:
        errMsg = f'failed to fetch page [{r.status_code} {r.reason}] {r.url}'
    if isv: print(errMsg)
    return errMsg if returnErr else None
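
The function above builds its request by URL-encoding the target link and the optional proxy settings into a query string. Below is a minimal offline sketch of just that step, so the request URL can be inspected without spending one of the monthly requests. The helper name `build_scrapingant_url` and its parameter names are made up for this illustration; only the endpoint and the query keys come from the paste.

```python
import urllib.parse

# endpoint taken from the paste above
SA_API = 'https://api.scrapingant.com/v2/general'

def build_scrapingant_url(target_url, api_key, country=None,
                          residential=False, wait_css=None):
    """Hypothetical helper: build the GET URL without sending a request."""
    params = {'url': target_url, 'x-api-key': api_key}
    if residential: params['proxy_type'] = 'residential'
    if country: params['proxy_country'] = country
    if wait_css: params['wait_for_selector'] = wait_css
    # urlencode percent-escapes the target URL so its own '?' and '&'
    # cannot be confused with the API's query parameters
    return f'{SA_API}?{urllib.parse.urlencode(params)}'

if __name__ == '__main__':
    demo = build_scrapingant_url('https://example.com/search?q=a b',
                                 'YOUR_API_TOKEN', country='US')
    print(demo)
    # decoding the query string gives back the original values
    print(urllib.parse.parse_qs(urllib.parse.urlsplit(demo).query))
```

For an actual fetch, something like `soup = linkToSoup_scrapingAnt('https://example.com', isv=False)` followed by a check for `None` should work, assuming a valid token has been pasted in as `defaultKey` or passed as `apiKey`.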