Tag: crawl (18 results)
  1. Caterpillar
    317 total visits
    Caterpillar is a PHP screen-scraping script that can crawl remote websites and extract the desired information. It differs from other screen scrapers in that it issues parallel requests to speed up the parsing process (a sketch of that technique follows below).
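    A minimal sketch of parallel fetching with PHP's curl_multi API, the standard mechanism for concurrent HTTP requests; the function name and options are illustrative assumptions, not Caterpillar's actual code.

    <?php
    // Hypothetical helper: fetch several URLs concurrently with curl_multi.
    function fetch_parallel(array $urls): array
    {
        $mh = curl_multi_init();
        $handles = [];
        foreach ($urls as $url) {
            $ch = curl_init($url);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
            curl_multi_add_handle($mh, $ch);
            $handles[$url] = $ch;
        }
        // Drive all transfers until none remain active.
        do {
            $status = curl_multi_exec($mh, $active);
            if ($active) {
                curl_multi_select($mh); // wait for activity instead of busy-looping
            }
        } while ($active && $status === CURLM_OK);
        $pages = [];
        foreach ($handles as $url => $ch) {
            $pages[$url] = curl_multi_getcontent($ch);
            curl_multi_remove_handle($mh, $ch);
            curl_close($ch);
        }
        curl_multi_close($mh);
        return $pages; // url => HTML body
    }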
  2. PHP Scanner
    450 total visits
    This class can be used to crawl a site to find PHP errors. It retrieves the pages and images of a site and follows the links recursively to determine whether any pages exhibit error messages generated by PHP errors. The class displays an error message for any pages detected to have errors (a sketch of the detection step follows below).
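    A sketch of the detection step, assuming the scanner simply searches fetched output for PHP's default error prefixes; the helper name and pattern list are assumptions, not the class's actual code.

    <?php
    // Fetch a page and look for telltale PHP error strings in its output.
    function page_has_php_error(string $url): bool
    {
        $html = @file_get_contents($url);
        if ($html === false) {
            return false; // unreachable page: not a PHP error as such
        }
        // Default PHP error output begins with one of these prefixes.
        return (bool) preg_match('/\b(Fatal error|Parse error|Warning|Notice):\s/', $html);
    }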
  3. CrawlProtect
    2327 total visits
    CrawlProtect gives you three levels of protection for your website. First, with the help of htaccess rules, CrawlProtect blocks code injection attempts, SQL injection attempts, visits from crawlers known as "badbots" (crawlers used by hackers), website copiers, and shell command execution attempts. Second, it provides an easy way to change the chmod of your site's folders and files; ...
  4. Bot recognizer and dispatcher
    2139 total visits
    Bot recognizer and dispatcher checks the IP address of the computer, or the user agent of the browser, currently accessing a Web service to determine whether it is known to be used by search engine robots or malicious Web crawlers. It can call different callback functions depending on the type of crawler that was identified (a sketch of this dispatch pattern follows below). ...
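    A minimal sketch of user-agent-based dispatch, assuming a hand-kept signature list; the real class's detection data and callback API are not shown here.

    <?php
    $botSignatures = [
        'Googlebot' => 'search_engine',
        'bingbot'   => 'search_engine',
        'HTTrack'   => 'site_copier',
    ];

    // Match the user agent against known signatures and invoke the
    // callback registered for the detected crawler type.
    function dispatch_by_agent(string $ua, array $signatures, array $callbacks): void
    {
        foreach ($signatures as $needle => $type) {
            if (stripos($ua, $needle) !== false) {
                ($callbacks[$type])($ua);
                return;
            }
        }
        ($callbacks['human'])($ua); // default: treat as a normal visitor
    }

    dispatch_by_agent($_SERVER['HTTP_USER_AGENT'] ?? '', $botSignatures, [
        'search_engine' => fn($ua) => error_log("search bot: $ua"),
        'site_copier'   => fn($ua) => http_response_code(403),
        'human'         => fn($ua) => null,
    ]);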
  5. GoogleCrawler
    1944 total visits
    GoogleCrawler can send HTTP requests to the Google search pages to perform searches for given keywords. The retrieved search result pages can be parsed to extract the URLs with their associated titles and description snippets (the extraction step is sketched below). GoogleCrawler retrieves a single page of results at a time; the page number and the number of results per page can be specified. Requirements: PHP 5.0 or higher
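    The extraction step might look roughly like the sketch below, which pulls absolute links and their anchor text out of a fetched page with DOMDocument. Google's result markup changes frequently, so this is purely illustrative, not GoogleCrawler's actual parser.

    <?php
    // Extract absolute URLs and their anchor text from an HTML page.
    function extract_links(string $html): array
    {
        $doc = new DOMDocument();
        @$doc->loadHTML($html); // suppress warnings from real-world markup
        $results = [];
        foreach ($doc->getElementsByTagName('a') as $a) {
            $href = $a->getAttribute('href');
            if (preg_match('#^https?://#', $href)) {
                $results[] = ['url' => $href, 'title' => trim($a->textContent)];
            }
        }
        return $results;
    }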
  6. Spider website
    2096 total visits
    Spider website can retrieve a page of a site and follow all links recursively to retrieve all the site's URLs. The crawling can be restricted to URLs with a particular extension. It can also avoid accessing pages listed in the site's robots.txt file, or pages set with the noindex or nofollow meta tags (the meta-tag check is sketched below). Requirements: PHP 5.0 or higher
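    The meta-tag check could be as simple as the sketch below; the helper is an assumption, not Spider website's own function.

    <?php
    // Return false when a page's robots meta tag forbids indexing or
    // link-following, so the crawler can skip it.
    function robots_meta_allows(string $html): bool
    {
        $doc = new DOMDocument();
        @$doc->loadHTML($html);
        foreach ($doc->getElementsByTagName('meta') as $meta) {
            if (strtolower($meta->getAttribute('name')) === 'robots') {
                $content = strtolower($meta->getAttribute('content'));
                if (strpos($content, 'noindex') !== false ||
                    strpos($content, 'nofollow') !== false) {
                    return false;
                }
            }
        }
        return true;
    }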
  7. Link Searcher
    1715 total visits
    Link Searcher retrieves a given Web page and searches for the links contained in it. Newly found links are added to a queue to be crawled later, implementing recursive searching up to a given depth limit (see the sketch below). Link Searcher supports regular expressions. Requirements: PHP 4.0 or higher
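    A compact sketch of that queue-based, depth-limited pattern; the link regex is a deliberate simplification of real HTML parsing.

    <?php
    // Breadth-first crawl from a start URL up to $maxDepth levels deep.
    function crawl(string $start, int $maxDepth): array
    {
        $queue = [[$start, 0]];
        $seen  = [$start => true];
        $found = [];
        while ($queue) {
            [$url, $depth] = array_shift($queue);
            $found[] = $url;
            if ($depth >= $maxDepth) {
                continue; // do not expand links beyond the depth limit
            }
            $html = @file_get_contents($url);
            if ($html === false) {
                continue;
            }
            preg_match_all('/href="(https?:\/\/[^"#]+)"/i', $html, $m);
            foreach ($m[1] as $link) {
                if (!isset($seen[$link])) {
                    $seen[$link] = true;
                    $queue[] = [$link, $depth + 1];
                }
            }
        }
        return $found;
    }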
  8. Automap
    2358 total visits
    Automap can crawl a website by following the links found on the pages reachable from a given URL. It can build a sitemap XML document with all the URLs of the pages that were found (the sitemap-writing step is sketched below). The number of crawled links, the allowed page extensions, and the disallowed link URL parameters are configurable. Requirements: PHP 5 or higher
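    The sitemap-writing step can be sketched as below, emitting the sitemaps.org schema; this sketch is independent of Automap's actual implementation.

    <?php
    // Serialize a list of crawled URLs as a sitemap XML document.
    function build_sitemap(array $urls): string
    {
        $xml = new SimpleXMLElement(
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"/>'
        );
        foreach ($urls as $url) {
            // Pre-escape: SimpleXML does not escape ampersands in values.
            $xml->addChild('url')->addChild('loc', htmlspecialchars($url));
        }
        return $xml->asXML();
    }

    file_put_contents('sitemap.xml', build_sitemap([
        'https://example.com/',
        'https://example.com/about',
    ]));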
  9. hide_mail_link
    1648 total visits
    hide_mail_link hides email addresses from crawlers while still generating a clickable link (the underlying trick is sketched below). The generated code is "tidy-proof". hide_mail_link needs a JavaScript-enabled browser to work.
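    The usual trick is to assemble the address in JavaScript so crawlers reading the raw HTML never see a complete mailto: link; the sketch below is illustrative, not hide_mail_link's actual output.

    <?php
    // Emit a script that writes the mailto: link piece by piece at runtime.
    function obfuscated_mail_link(string $user, string $domain, string $text): string
    {
        return "<script>document.write('<a href=\"mailto:' + '{$user}' + '@'"
             . " + '{$domain}' + '\">{$text}</a>');</script>";
    }

    echo obfuscated_mail_link('info', 'example.com', 'Contact us');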
  10. Spambot Trap Deluxe
    2057 total visits
    Spambot Trap Deluxe can be used to generate pages with e-mail links pointing to spam-trap e-mail addresses. It can be used with mod_rewrite conditions to redirect unwanted crawling bots to a different spam-trap domain. In the spam-trap domain, Spambot Trap Deluxe can serve pages with bogus e-mail addresses, either in the form of mailto: links ...
  11. Anti-Spam
    1827 total visits
    Anti-Spam protects e-mail addresses in Web pages from being harvested by spammers, who typically use programs to crawl and scan Web sites.
  12. Google Crawler
    2020 total visits
    Google Crawler retrieves the Google search result pages for given keywords and parses them to extract the list of search result URLs, titles, and excerpts from the pages that were found. The maximum number of result pages to retrieve is configurable. Requirements: PHP 4.0 or higher
  13. Web Crawler using MySQL DB
    2083 total visits
    Web Crawler using MySQL DB retrieves a given Web page and parses its HTML content to extract the URLs of links and frames. Crawled URLs are stored in a MySQL database table if they were not stored previously (the store-if-new step is sketched below). The list of URLs already stored from a specified domain name can also be displayed on a Web ...
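    The store-if-new step can be sketched with PDO as below; the table and column names are assumptions, not the script's actual schema.

    <?php
    $pdo = new PDO('mysql:host=localhost;dbname=crawler', 'user', 'pass');
    $pdo->exec('CREATE TABLE IF NOT EXISTS urls (
        url VARCHAR(190) NOT NULL,
        PRIMARY KEY (url)
    )');

    // INSERT IGNORE silently skips URLs that are already stored.
    $stmt = $pdo->prepare('INSERT IGNORE INTO urls (url) VALUES (?)');
    $stmt->execute(['https://example.com/page']);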
  14. Robots_txt
    1988 total visits
    Robots_txt takes the URL of a page and retrieves the robots.txt file of the same site. The robots.txt file is parsed and the rules defined in it are looked up in order to determine whether crawling a page is allowed (a simplified check is sketched below). Robots_txt also stores the time when a page is crawled, to check whether next time another page of the same site is ...
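    A simplified version of the allow/deny check; real parsers also handle Allow lines and per-agent groups, so this is a sketch, not Robots_txt's actual logic.

    <?php
    // Fetch a site's robots.txt and test a path against the Disallow
    // rules in the "User-agent: *" group.
    function robots_allows(string $siteRoot, string $path): bool
    {
        $txt = @file_get_contents(rtrim($siteRoot, '/') . '/robots.txt');
        if ($txt === false) {
            return true; // no robots.txt means crawling is unrestricted
        }
        $applies = false;
        foreach (preg_split('/\r?\n/', $txt) as $line) {
            $line = trim(preg_replace('/#.*/', '', $line)); // strip comments
            if (stripos($line, 'User-agent:') === 0) {
                $applies = trim(substr($line, 11)) === '*';
            } elseif ($applies && stripos($line, 'Disallow:') === 0) {
                $rule = trim(substr($line, 9));
                if ($rule !== '' && strpos($path, $rule) === 0) {
                    return false; // path matches a Disallow prefix
                }
            }
        }
        return true;
    }

    var_dump(robots_allows('https://example.com', '/private/page.html'));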
  15. Spider Class
    2314 total visits
    Spider Class can retrieve Web pages and parse them to extract the list of their links, continuing to crawl all linked pages. Pages may be retrieved iteratively until a given limit of pages or link depth is reached. Spider Class allows setting regular expressions for both link definitions and content matches, changeable at every depth.
Page 1 of 2