Web robots (also known as crawlers or spiders) are programs that traverse the Web automatically; search engines use them to index all or part of the Web.
The Web Robots Pages
Information on the robots.txt Robots Exclusion Standard and other articles about writing well-behaved Web robots.
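To illustrate how a well-behaved robot honors the Robots Exclusion Standard mentioned above, here is a minimal sketch using Python's standard-library `urllib.robotparser`; the robots.txt rules and the bot name are hypothetical examples.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content a site might serve.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# A polite crawler checks each URL before fetching it.
print(rp.can_fetch("ExampleBot", "https://example.com/private/page.html"))  # False
print(rp.can_fetch("ExampleBot", "https://example.com/public/page.html"))   # True
```

In practice a crawler would first download `https://<host>/robots.txt` (e.g. via `rp.set_url(...)` and `rp.read()`) rather than parsing a hard-coded string.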
About Search Indexing Robots and Spiders
Search Tools Consulting explains how the search engine programs called "robots" or "spiders" work, and reviews related sites.
ACAP - Automated Content Access Protocol
Standard being developed on behalf of content publishers to communicate permissions information more extensively than is the case with robots.txt. Project documents, implementation and background information.
Bots vs Browsers
This large database lists user agents in categories and distinguishes between robots and browsers.
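Databases like this distinguish robots from browsers by inspecting the user-agent string. A minimal keyword heuristic can be sketched as follows; it is an illustrative assumption, not a substitute for a curated database, since many robots do not advertise themselves with these keywords.

```python
import re

# Common substrings that self-identifying robots tend to include
# in their user-agent strings (an assumed, non-exhaustive list).
BOT_PATTERN = re.compile(r"bot|crawler|spider|slurp", re.IGNORECASE)

def looks_like_robot(user_agent: str) -> bool:
    """Return True if the user-agent string matches common robot keywords."""
    return bool(BOT_PATTERN.search(user_agent))

print(looks_like_robot("Googlebot/2.1 (+http://www.google.com/bot.html)"))        # True
print(looks_like_robot("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/102.0"))  # False
```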
HTTP User Agent Index
An alphabetical list of user agents and the deployer behind them, compiled by Christoph Rüegg.
List of User-Agents
A searchable database of user-agents with information about their type, purpose and origin.
Search Engine IP Addresses
Lists IP addresses of search engine spiders. Can be searched by IP address. Also links to resources on spiders.
Search Engine Robots and Other User Agents
John A. Fotheringham presents data in tabular form on the robots sent by search engines and other sites to read and index Web pages: their origins, names and IP addresses.
User Agent String
Tool from ASAP Consulting s.r.o. for detailed user agent string analysis using an online form. Includes databases of user-agents for crawlers, spiders and browsers, plus tools for user-agent lookup and user-agent string search.
Last update: August 9, 2012 at 1:36:49 UTC