sbmonzur

WebScraping and Saving Data

Mar 10th, 2021
# I scraped the links using Beautiful Soup (code not included here), then from those links
# downloaded the specific HTML content of the articles I was interested in (titles, dates,
# names of contributors, main texts) and stored that information in a list. I then saved
# the list to a text file.

import requests
from bs4 import BeautifulSoup

PAlist = []  # holds the scraped elements for each article

for link in urlsPA:
    specificpagePA = requests.get(link)  # make a GET request and store the response in an object
    rawAddPagePA = specificpagePA.text  # read the content of the server's response
    PASoup2 = BeautifulSoup(rawAddPagePA, "html.parser")  # parse the response into an HTML tree
    # collect the elements of interest: main text, date/share bar, headline, contributor name
    PAcontent = PASoup2.find_all(class_=["story-element story-element-text", "time-social-share-wrapper storyPageMetaData-m__time-social-share-wrapper__2-RAX", "headline headline-type-9 story-headline bn-story-headline headline-m__headline__3vaq9 headline-m__headline-type-9__3gT8S", "contributor-name contributor-m__contributor-name__1-593"])
    print(PAcontent)
    PAlist.append(PAcontent)

# save the collected elements to a text file, one list item per line
with open('listfile.txt', 'w') as filehandle:
    for listitem in PAlist:
        filehandle.write('%s\n' % listitem)
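
# The link-collection step mentioned in the first comment is not included in the paste.
# Below is a minimal sketch of how a list of article URLs (urlsPA) might be built from a
# listing page; the index URL, the generic <a href> selection, and the use of urljoin are
# assumptions for illustration only, not the author's actual code.

from urllib.parse import urljoin

indexurl = "https://www.example.com/topic/page-1"  # assumed listing page, replace as needed
indexpage = requests.get(indexurl)  # fetch the listing page
indexsoup = BeautifulSoup(indexpage.text, "html.parser")

urlsPA = []
for anchor in indexsoup.find_all("a", href=True):  # every link on the listing page
    urlsPA.append(urljoin(indexurl, anchor["href"]))  # resolve relative URLs to absolute ones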