Jul 25th, 2015
Why are some sites harder to archive than others?

If you look at our collection of archived sites, you will find some broken pages, missing graphics, and some sites that aren't archived at all. Here are some things that make it difficult to archive a web site:

Robots.txt -- We respect robot exclusion rules, so pages that a site's robots.txt file excludes are not archived.
JavaScript -- JavaScript elements are often hard to archive, especially when they generate links without the full URL appearing in the page. In addition, if the JavaScript needs to contact the originating server in order to work, it will fail when archived.
Server-side image maps -- Like any functionality on the web that needs to contact the originating server in order to work, server-side image maps fail when archived.
Unknown sites -- The archive contains crawls of the Web completed by Alexa Internet. If Alexa doesn't know about your site, it won't be archived. If you use the Alexa Toolbar (available at www.alexa.com), Alexa will learn about your page. Or you can visit Alexa's Archive Your Site page at http://pages.alexa.com/help/webmasters/index.html#crawl_site.
Orphan pages -- If there are no links to your pages, the robots won't find them (the robots don't enter queries in search boxes).
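
To illustrate the robots.txt point, here is a minimal Python sketch of how a crawler might check exclusion rules before fetching a page. It uses the standard library's urllib.robotparser; the robots.txt contents, the "example-archiver" user agent, and the example.com URLs are made up for illustration and are not the archive's actual crawler configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents for an example site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_archivable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/index.html", "/private/page.html"):
        url = "http://example.com" + path
        print(url, is_archivable(ROBOTS_TXT, "example-archiver", url))
```

A crawler that respects these rules would skip everything under /private/, so those pages would never appear in an archive.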
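
The JavaScript problem can be sketched in a few lines of Python. The example below assumes a simple crawler that only reads href attributes from the static HTML (using the standard library's html.parser); the page contents and the LinkExtractor class are hypothetical. Such a crawler sees the plain link but not the one that the script assembles at run time.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags found in static HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical page: one plain link, one link built by a script.
PAGE = """
<a href="/about.html">About</a>
<script>
  var section = "news";
  document.write('<a href="/' + section + '/today.html">Today</a>');
</script>
"""


def static_links(html: str) -> list:
    """Return the links a non-JavaScript crawler could discover."""
    extractor = LinkExtractor()
    extractor.feed(html)
    return extractor.links


if __name__ == "__main__":
    print(static_links(PAGE))
```

The link to /news/today.html exists only after a browser runs the script, so a crawler working from the page source alone never learns about it.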
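
The orphan page problem is a reachability question: a robot that only follows links can archive a page only if some chain of links leads to it. The sketch below assumes a made-up site described as a dictionary of page-to-links and a crawl that starts from /index.html; the function and page names are illustrative only.

```python
from collections import deque


def reachable_pages(links: dict, start: str) -> set:
    """Breadth-first walk of the link graph, as a link-following robot would do."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: /hidden.html links out, but nothing links to it.
SITE = {
    "/index.html": ["/about.html", "/news.html"],
    "/about.html": ["/index.html"],
    "/news.html": [],
    "/hidden.html": ["/index.html"],
}


def orphan_pages(links: dict, start: str) -> set:
    """Pages that exist on the site but cannot be reached from start."""
    return set(links) - reachable_pages(links, start)


if __name__ == "__main__":
    print(orphan_pages(SITE, "/index.html"))
```

Here /hidden.html is an orphan: it exists, but no link points to it, so a crawl from the home page never finds it.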