sbmonzur

Finding HTML Tags

Mar 17th, 2021
Python
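# Read the whole file into memory as a single string.
with open('threenewsarticles.txt', 'r', encoding='utf8') as my_file:
    rawData = my_file.read()
print(rawData)

# Separate the body text from the metadata.
# This code only works if the text file has one article:
# find() returns the index of the first match only.
bodyTag = '<div class="story-element story-element-text">'
articleStart = rawData.find(bodyTag)
if articleStart == -1:
    # find() returns -1 when the tag is absent; slicing with -1 would
    # silently drop the last character instead of failing.
    raise ValueError("Body tag not found in threenewsarticles.txt")
articlemetaData = rawData[:articleStart]
articleBody = rawData[articleStart:]
print(articlemetaData)
print("*******")
print(articleBody)
print("*******")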
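
The comment above notes that the snippet only handles a file with one article, because find() stops at the first match. A minimal sketch of one way to locate every occurrence of the tag follows. The helper names find_all and split_at_marker, and the sample string, are hypothetical and not part of the original paste. It also assumes each occurrence of the tag starts a new body block; the tag alone cannot tell where the next article's metadata begins, so each chunk runs up to the next tag.

```python
# Assumption: every article body starts with this exact tag, as in the paste.
BODY_TAG = '<div class="story-element story-element-text">'


def find_all(text, marker):
    """Return the start index of every non-overlapping occurrence of marker."""
    if not marker:
        raise ValueError("marker must be a non-empty string")
    positions = []
    start = text.find(marker)
    while start != -1:
        positions.append(start)
        # Resume the search just past the match that was found.
        start = text.find(marker, start + len(marker))
    return positions


def split_at_marker(text, marker):
    """Return (leading_text, chunks), one chunk per marker occurrence.

    Each chunk starts at a marker and runs to the next marker (or the end
    of the text), so it may include the following article's metadata.
    """
    positions = find_all(text, marker)
    if not positions:
        return text, []
    bounds = positions + [len(text)]
    chunks = [text[bounds[i]:bounds[i + 1]] for i in range(len(positions))]
    return text[:positions[0]], chunks


# Hypothetical sample standing in for the contents of threenewsarticles.txt.
sample = ("meta A" + BODY_TAG + "body A</div>"
          "meta B" + BODY_TAG + "body B</div>")
leading, chunks = split_at_marker(sample, BODY_TAG)
print(leading)
for chunk in chunks:
    print("*******")
    print(chunk)
```

To use it on the real file, pass rawData in place of sample. If a single article contains several story-element-text blocks, the chunks would need to be regrouped per article by some other signal in the file.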