Untitled

a guest
Jan 16th, 2019
import re


def get_document_emails(pdf_format_content):
    """Extract all email addresses found in the document text.

    Returns a sorted list of the addresses, without duplicates.
    """
    new_list = []
    # Local part, "@", first domain label, a literal dot, then the rest of the domain.
    document_emails = re.findall(r'[\w.]+@\w+\.[\w.]+', pdf_format_content)
    for i in document_emails:
        # Skip duplicates and matches that end in a dot (e.g. swallowed sentence punctuation).
        if i not in new_list and not i.endswith('.'):
            new_list.append(i)
    return sorted(new_list)

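def get_document_provider(pdf_format_content):
    """Return the name of the provider, or None if no PROVEEDOR line is found."""
    # re.search scans the whole text; re.match would only try the very start.
    match = re.search(r'PROVEEDOR:(.*?)\n', pdf_format_content)
    return match.group(1).strip() if match else None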
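
A short usage sketch for the two functions above, assuming the PDF has already been converted to plain text. The sample text, the company name, and the addresses are made up for illustration, and the compact function bodies are only one possible way to get a sorted, duplicate-free list of emails and the provider name as a string; they are not necessarily what the original author intended.

```python
import re


def get_document_emails(pdf_format_content):
    """Return a sorted list of unique email addresses found in the text."""
    found = re.findall(r'[\w.]+@\w+\.[\w.]+', pdf_format_content)
    # A set removes repeats; matches ending in a dot are discarded.
    return sorted({e for e in found if not e.endswith('.')})


def get_document_provider(pdf_format_content):
    """Return the text after 'PROVEEDOR:' on its line, or None if absent."""
    match = re.search(r'PROVEEDOR:(.*?)\n', pdf_format_content)
    return match.group(1).strip() if match else None


# Hypothetical sample standing in for text extracted from a PDF.
sample_text = (
    "PROVEEDOR: ACME S.A.\n"
    "Contacto: ventas@acme.com, soporte@acme.com\n"
    "Copia: ventas@acme.com\n"
)

emails = get_document_emails(sample_text)
provider = get_document_provider(sample_text)
print(emails)    # ['soporte@acme.com', 'ventas@acme.com']
print(provider)  # ACME S.A.
```

Note that the pattern still requires a newline after the provider name, so a PROVEEDOR entry on the last line of a text with no trailing newline would not be found.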