Not a member of Pastebin yet?
Sign Up,
it unlocks many cool features!
- NAME:
- Snoopy - the PHP net client v2.0.0
- SYNOPSIS:
- include "Snoopy.class.php";
- $snoopy = new Snoopy;
- $snoopy->fetchtext("http://www.php.net/");
- print $snoopy->results;
- $snoopy->fetchlinks("http://www.phpbuilder.com/");
- print $snoopy->results;
- $submit_url = "http://lnk.ispi.net/texis/scripts/msearch/netsearch.html";
- $submit_vars["q"] = "amiga";
- $submit_vars["submit"] = "Search!";
- $submit_vars["searchhost"] = "Altavista";
- $snoopy->submit($submit_url,$submit_vars);
- print $snoopy->results;
- $snoopy->maxframes=5;
- $snoopy->fetch("http://www.ispi.net/");
- echo "<PRE>\n";
- echo htmlentities($snoopy->results[0]);
- echo htmlentities($snoopy->results[1]);
- echo htmlentities($snoopy->results[2]);
- echo "</PRE>\n";
- $snoopy->fetchform("http://www.altavista.com");
- print $snoopy->results;
- DESCRIPTION:
- What is Snoopy?
- Snoopy is a PHP class that simulates a web browser. It automates the
- task of retrieving web page content and posting forms, for example.
- Some of Snoopy's features:
- * easily fetch the contents of a web page
- * easily fetch the text from a web page (strip html tags)
- * easily fetch the the links from a web page
- * supports proxy hosts
- * supports basic user/pass authentication
- * supports setting user_agent, referer, cookies and header content
- * supports browser redirects, and controlled depth of redirects
- * expands fetched links to fully qualified URLs (default)
- * easily submit form data and retrieve the results
- * supports following html frames (added v0.92)
- * supports passing cookies on redirects (added v0.92)
- REQUIREMENTS:
- Snoopy requires PHP with PCRE (Perl Compatible Regular Expressions),
- and the OpenSSL extension for fetching HTTPS requests.
- CLASS METHODS:
- fetch($URI)
- -----------
- This is the method used for fetching the contents of a web page.
- $URI is the fully qualified URL of the page to fetch.
- The results of the fetch are stored in $this->results.
- If you are fetching frames, then $this->results
- contains each frame fetched in an array.
- fetchtext($URI)
- ---------------
- This behaves exactly like fetch() except that it only returns
- the text from the page, stripping out html tags and other
- irrelevant data.
- fetchform($URI)
- ---------------
- This behaves exactly like fetch() except that it only returns
- the form elements from the page, stripping out html tags and other
- irrelevant data.
- fetchlinks($URI)
- ----------------
- This behaves exactly like fetch() except that it only returns
- the links from the page. By default, relative links are
- converted to their fully qualified URL form.
- submit($URI,$formvars)
- ----------------------
- This submits a form to the specified $URI. $formvars is an
- array of the form variables to pass.
- submittext($URI,$formvars)
- --------------------------
- This behaves exactly like submit() except that it only returns
- the text from the page, stripping out html tags and other
- irrelevant data.
- submitlinks($URI)
- ----------------
- This behaves exactly like submit() except that it only returns
- the links from the page. By default, relative links are
- converted to their fully qualified URL form.
- CLASS VARIABLES: (default value in parenthesis)
- $host the host to connect to
- $port the port to connect to
- $proxy_host the proxy host to use, if any
- $proxy_port the proxy port to use, if any
- proxy can only be used for http URLs, but not https
- $agent the user agent to masqerade as (Snoopy v0.1)
- $referer referer information to pass, if any
- $cookies cookies to pass if any
- $rawheaders other header info to pass, if any
- $maxredirs maximum redirects to allow. 0=none allowed. (5)
- $offsiteok whether or not to allow redirects off-site. (true)
- $expandlinks whether or not to expand links to fully qualified URLs (true)
- $user authentication username, if any
- $pass authentication password, if any
- $accept http accept types (image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*)
- $error where errors are sent, if any
- $response_code responde code returned from server
- $headers headers returned from server
- $maxlength max return data length
- $read_timeout timeout on read operations (requires PHP 4 Beta 4+)
- set to 0 to disallow timeouts
- $timed_out true if a read operation timed out (requires PHP 4 Beta 4+)
- $maxframes number of frames we will follow
- $status http status of fetch
- $temp_dir temp directory that the webserver can write to. (/tmp)
- $curl_path system path to cURL binary, set to false if none
- (this variable is ignored as of Snoopy v1.2.6)
- $cafile name of a file with CA certificate(s)
- $capath name of a correctly hashed directory with CA certificate(s)
- if either $cafile or $capath is set, SSL certificate
- verification is enabled
- EXAMPLES:
- Example: fetch a web page and display the return headers and
- the contents of the page (html-escaped):
- include "Snoopy.class.php";
- $snoopy = new Snoopy;
- $snoopy->user = "joe";
- $snoopy->pass = "bloe";
- if($snoopy->fetch("http://www.slashdot.org/"))
- {
- echo "response code: ".$snoopy->response_code."<br>\n";
- while(list($key,$val) = each($snoopy->headers))
- echo $key.": ".$val."<br>\n";
- echo "<p>\n";
- echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
- }
- else
- echo "error fetching document: ".$snoopy->error."\n";
- Example: submit a form and print out the result headers
- and html-escaped page:
- include "Snoopy.class.php";
- $snoopy = new Snoopy;
- $submit_url = "http://lnk.ispi.net/texis/scripts/msearch/netsearch.html";
- $submit_vars["q"] = "amiga";
- $submit_vars["submit"] = "Search!";
- $submit_vars["searchhost"] = "Altavista";
- if($snoopy->submit($submit_url,$submit_vars))
- {
- while(list($key,$val) = each($snoopy->headers))
- echo $key.": ".$val."<br>\n";
- echo "<p>\n";
- echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
- }
- else
- echo "error fetching document: ".$snoopy->error."\n";
- Example: showing functionality of all the variables:
- include "Snoopy.class.php";
- $snoopy = new Snoopy;
- $snoopy->proxy_host = "my.proxy.host";
- $snoopy->proxy_port = "8080";
- $snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)";
- $snoopy->referer = "http://www.microsnot.com/";
- $snoopy->cookies["SessionID"] = 238472834723489l;
- $snoopy->cookies["favoriteColor"] = "RED";
- $snoopy->rawheaders["Pragma"] = "no-cache";
- $snoopy->maxredirs = 2;
- $snoopy->offsiteok = false;
- $snoopy->expandlinks = false;
- $snoopy->user = "joe";
- $snoopy->pass = "bloe";
- if($snoopy->fetchtext("http://www.phpbuilder.com"))
- {
- while(list($key,$val) = each($snoopy->headers))
- echo $key.": ".$val."<br>\n";
- echo "<p>\n";
- echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
- }
- else
- echo "error fetching document: ".$snoopy->error."\n";
- Example: fetched framed content and display the results
- include "Snoopy.class.php";
- $snoopy = new Snoopy;
- $snoopy->maxframes = 5;
- if($snoopy->fetch("http://www.ispi.net/"))
- {
- echo "<PRE>".htmlspecialchars($snoopy->results[0])."</PRE>\n";
- echo "<PRE>".htmlspecialchars($snoopy->results[1])."</PRE>\n";
- echo "<PRE>".htmlspecialchars($snoopy->results[2])."</PRE>\n";
- }
- else
- echo "error fetching document: ".$snoopy->error."\n";
- COPYRIGHT:
- Copyright(c) 1999,2000 ispi. All rights reserved.
- This software is released under the GNU General Public License.
- Please read the disclaimer at the top of the Snoopy.class.php file.
- THANKS:
- Special Thanks to:
- Peter Sorger <sorgo@cool.sk> help fixing a redirect bug
- Andrei Zmievski <andrei@ispi.net> implementing time out functionality
- Patric Sandelin <patric@kajen.com> help with fetchform debugging
- Carmelo <carmelo@meltingsoft.com> misc bug fixes with frames
Add Comment
Please, Sign In to add comment