Try95th

linkToSoup_selenium - simplified

Dec 2nd, 2022 (edited)
### Takes a URL and returns a BeautifulSoup object (or None/errorMsg if there is an error) ###
### [ For when BeautifulSoup(requests.get(url).content) is not enough ] ######################

## full version at https://pastebin.com/kEC9gPC8
## requests-based version/s at https://pastebin.com/rBTr06vy and https://pastebin.com/5ibz2F6p

## [if you want a quick tutorial on selenium, see https://www.scrapingbee.com/blog/selenium-python/]
#### REQUIRED: download chromedriver.exe from https://chromedriver.chromium.org/downloads ####
#### [AND copy chromedriver.exe to the same folder as this py file] ####
## [with selenium 4.6+, Selenium Manager can usually find or download a matching driver for you]


import time
from bs4 import BeautifulSoup
from selenium import webdriver

def linkToSoup_selenium(lUrl, tmout=None, fparser='html.parser', isv=True, returnErr=False):
    driver = None
    try:
        # I copy chromedriver.exe to the same folder as this py file
        driver = webdriver.Chrome()
        driver.maximize_window()
        driver.get(lUrl)
        # optional fixed wait (in seconds) so JavaScript-rendered content can load
        if type(tmout) in [int, float]: time.sleep(tmout)

        return BeautifulSoup(driver.page_source, fparser)
    except Exception as e:
        if isv: print(str(e))   ## set isv=False to suppress error message ##
        return str(e) if returnErr else None   ## set returnErr=True to get the message back ##
    finally:
        # runs on success AND on error, so the browser is never left open
        # quit() ends the whole session; close() only closes the current window
        if driver is not None: driver.quit()
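
The fixed `time.sleep(tmout)` above waits the full time even if the page is ready sooner, and can still be too short on a slow load. Selenium ships `WebDriverWait` (in `selenium.webdriver.support.ui`) for condition-based waiting. The sketch below is not that API. It is a small hand-rolled polling loop that illustrates the same idea using only the standard library. The names `wait_for_page_source` and `is_ready` are hypothetical and not part of the original paste. The only assumption about `driver` is that it has a `page_source` attribute.

```python
import time

def wait_for_page_source(driver, is_ready, timeout=10.0, poll=0.5):
    """Poll driver.page_source until is_ready(html) is truthy or timeout elapses.

    Hypothetical helper (not a selenium function).
    Returns the HTML that satisfied is_ready, or None if the timeout was reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        html = driver.page_source      # re-read the current DOM on every pass
        if is_ready(html):
            return html                # condition met: stop waiting early
        if time.monotonic() >= deadline:
            return None                # gave up: caller decides what to do
        time.sleep(poll)               # short pause before the next check
```

Inside `linkToSoup_selenium`, the sleep line could be replaced by something like `html = wait_for_page_source(driver, lambda h: 'id="results"' in h, timeout=tmout)`, where `id="results"` stands in for whatever marker the target page shows once loaded, and `html` is then passed to `BeautifulSoup` if it is not `None`.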