
Untitled

a guest
May 1st, 2017
# Challenge:

Create a command-line program that takes an internet domain name (e.g. “jana.com”) and prints a list of the email addresses found on that website only.

## Example:
The following is *expected* output from jana.com and web.mit.edu, but the program should also run on other websites. In the example of jana.com, the program should not crawl other subdomains (blog.jana.com, technology.jana.com).

```
# This is the expected output from www.jana.com
> python find_email_addresses.py www.jana.com
Found these email addresses:
sales@jana.com
press@jana.com
info@jana.com

# Here are some examples from web.mit.edu (subject to change)
> python find_email_addresses.py mit.edu
Found these email addresses:
campus-map@mit.edu
mitgrad@mit.edu
sfs@mit.edu
llwebmaster@ll.mit.edu
webmaster@ll.mit.edu
whatsonyourmind@mit.edu
fac-officers@mit.edu
```

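The subdomain rule from the example could be checked with a small helper like the one below. This is only a sketch: the function name `same_host` is hypothetical, and treating `www.jana.com` and `jana.com` as the same site is an assumption, not something the challenge states.

```python
from urllib.parse import urlsplit


def same_host(url: str, root_host: str) -> bool:
    """Return True when `url` points at the same host as `root_host`.

    Assumption: a leading "www." is ignored, so "www.jana.com" and
    "jana.com" count as one site. Every other subdomain, such as
    "blog.jana.com", is treated as a different site.
    """
    # hostname is None for relative links and for mailto: URLs.
    host = (urlsplit(url).hostname or "").lower()

    def strip_www(name: str) -> str:
        return name[4:] if name.startswith("www.") else name

    return strip_www(host) == strip_www(root_host.lower())
```

Relative links such as `/contact` have no hostname and return `False` here, so a caller would resolve them against the current page URL (for example with `urllib.parse.urljoin`) before calling the helper.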
## More information:

- You can use any modern programming language you like. We work in Python and Java, so one of those is preferred but not required.
- Create a new GitHub repository for this project. The repository should be public, but please give it some kind of codename that doesn't have the word `jana` in it. Leave the master branch empty and put your code in a separate branch.
- Push your branch up to GitHub and create a pull request. Send me the link to the pull request, and I can comment directly on it. All our code goes through this code review process, so it's a little glimpse into how we work.
- In your repo, please include a readme with any instructions we might need to set up and install your solution.
- Your program must work on another computer, so be sure to declare any required libraries (using libraries is OK). You do not need to check in the source for those libraries. Build scripts and/or a requirements.txt file would be preferred.

## Hints:

- Make sure to find email addresses on any discoverable page of the website, not just the home page.

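One way to reach pages beyond the home page is a breadth-first crawl that follows same-host links and scans each page for addresses. The sketch below is an illustration under stated assumptions, not a reference solution: `crawl_for_emails`, `LinkParser`, the `fetch` callable, and the `max_pages` limit are all hypothetical names, and the email pattern is deliberately simple.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit

# Simplified pattern (an assumption): it will miss obfuscated addresses
# and does not try to cover every form allowed by the email RFCs.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl_for_emails(start_url, fetch, max_pages=100):
    """Breadth-first crawl of a single host; returns sorted unique emails.

    `fetch` is a hypothetical callable that maps a URL to HTML text, or
    None on failure, so the network layer can be swapped out or faked.
    """
    host = urlsplit(start_url).hostname
    queue = deque([start_url])
    seen = {start_url}
    emails = set()
    fetched = 0

    while queue and fetched < max_pages:
        url = queue.popleft()
        html = fetch(url)
        fetched += 1
        if html is None:
            continue
        # Scanning the raw HTML also catches addresses inside mailto: links.
        emails.update(EMAIL_RE.findall(html))

        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments before comparing.
            link, _ = urldefrag(urljoin(url, href))
            parts = urlsplit(link)
            if (parts.scheme in ("http", "https")
                    and parts.hostname == host
                    and link not in seen):
                seen.add(link)
                queue.append(link)

    return sorted(emails)
```

Passing `fetch` in as a parameter keeps the crawl logic testable without network access; a real run would supply a function that downloads the page, and would likely also need to handle redirects, non-HTML responses, and the `www.` prefix, none of which this sketch covers.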
## Style:

- At Jana we follow the Google Style Guides for Python and Java. However, following them is not critical for this challenge.