- The Big G is for accessing Google (and Bing, by special request only); it does not work with any other sites.
- Your queries to the Big G server are sent over HTTP, whereas the actual queries to Google are sent over HTTPS.
- URLs need to be as organic as possible: do not use unnatural query constructs such as num=100, or the gl and uule parameters (which are incompatible with The Big G). Google is now very sensitive to URL manipulation during mass scraping.
- Queries are throttled to a maximum of 1,000 per minute.
- Close the TCP/IP connection after each unique keyword query completes; otherwise the capabilities of The Big G will be severely limited.
- Cookies can be used but must be discarded after each unique keyword request completes.
- Use random, up-to-date user agents, but retain the same user agent for every SERP of a given keyword. For example, retain user agent A for pages 1, 2, 3... of your first keyword query, then retain user agent B for pages 1, 2, 3... of your next keyword query.
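
The per-keyword rules above (one user agent per keyword, cookies discarded afterwards, plain URLs, at most 1,000 queries per minute) can be sketched roughly as follows. This is only an illustration under stated assumptions, not The Big G's own client: the names Throttle, build_url, run_keyword and USER_AGENTS are made up for this example, the user-agent strings are placeholders, and paging with a start parameter is an assumption about what a natural results URL looks like. The actual transport is left to a caller-supplied fetch function, which would be responsible for sending the request through The Big G and closing the connection when the keyword is done.

```python
import random
import time
import urllib.parse
from http.cookiejar import CookieJar

# Placeholder pool (assumption): replace with real, current user-agent strings.
USER_AGENTS = [
    "ExampleBrowser/1.0 (placeholder A)",
    "ExampleBrowser/2.0 (placeholder B)",
]
MAX_PER_MINUTE = 1000  # limit stated in the rules above


class Throttle:
    """Spaces calls evenly so at most max_per_minute happen per minute."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / max_per_minute
        self.clock = clock
        self.sleep = sleep
        self.next_ok = None  # earliest time the next call may proceed

    def wait(self):
        now = self.clock()
        if self.next_ok is not None and now < self.next_ok:
            self.sleep(self.next_ok - now)
            now = self.next_ok
        self.next_ok = now + self.interval


def build_url(keyword, page):
    """Plain search URL: only q, plus start for pages after the first.

    Deliberately omits num, gl and uule. Paging via start is an assumption.
    """
    params = {"q": keyword}
    if page > 1:
        params["start"] = (page - 1) * 10
    return "https://www.google.com/search?" + urllib.parse.urlencode(params)


def run_keyword(keyword, pages, fetch, throttle):
    """Fetch every SERP page for one keyword under a single identity.

    fetch(url, user_agent, cookie_jar) is supplied by the caller; it is
    expected to send the request and handle closing the connection.
    """
    user_agent = random.choice(USER_AGENTS)  # same agent for all pages
    jar = CookieJar()                        # fresh cookies per keyword
    try:
        results = []
        for page in range(1, pages + 1):
            throttle.wait()
            results.append(fetch(build_url(keyword, page), user_agent, jar))
        return results
    finally:
        jar.clear()                          # discard cookies afterwards
```

A caller would create one Throttle(MAX_PER_MINUTE) for the whole run and call run_keyword once per keyword, so each keyword gets a new user agent and an empty cookie jar while the rate limit is shared across all of them.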