
DMOZ: Open Directory Project

Who is "Robozilla"?

Robozilla is a web crawler that periodically visits all the websites listed in the Open Directory Project. When a page has moved or cannot be found, Robozilla notes this fact in the Open Directory. There can be many reasons a page isn't found, and you might take different actions depending on the reason.

What should I do about Robozilla Errors?

Your first job is to become Sherlock Holmes and figure out why the URL isn't responding. Below is a list of common reasons for errors and actions you might take.

Problem: There is a typo in the URL.
Solution: An editor typed in the URL incorrectly. Simply fix the URL. Try adding (or removing) common endings, such as default.html, default.htm, default.asp, index.html, index.htm, index.cgi, index.mv, index.asp, index.php, main.html, main.htm, etc.

Problem: The URL works for me.
Solution: The server may have been down when Robozilla crawled the site. After verifying that the content of the site corresponds to the title and description of the listing, click the "URL works for me: Clear Error" button.

Problem: The URL is dead.
Solution: Maybe the page moved. Go look for it. Try searching Google; they may have cached a version of the site with a pointer to the new location. Also, try searching with Bing or another search engine for the site title or some keywords specific to that site to see if there's an updated URL available.

Problem: The URL is dead, I looked for it, I have no idea where it is now.
Solution: Leave it in the unreviewed section of your category. If it is still dead next week and doesn't contain unique information, delete it. If the site is unique to the category, you can leave it in place and recheck it after several weeks.
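The "add (or remove) common endings" advice can be sketched as a small helper that lists alternative URLs to try by hand. This is only an illustration: the function name `candidate_urls` and the choice to drop any query string are assumptions, not part of the dmoz editing tools.

```python
from urllib.parse import urlsplit, urlunsplit

# Endings named in the advice above.
COMMON_ENDINGS = [
    "default.html", "default.htm", "default.asp",
    "index.html", "index.htm", "index.cgi", "index.mv",
    "index.asp", "index.php", "main.html", "main.htm",
]

def candidate_urls(url):
    """Return alternative URLs made by removing or adding a common ending.

    Hypothetical helper: it only builds the list, it does not fetch anything.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    directory, _, last = path.rpartition("/")

    if last in COMMON_ENDINGS:
        # The URL already ends in a common file name: try the bare
        # directory first, then the other endings.
        base = directory + "/"
        paths = [base] + [base + e for e in COMMON_ENDINGS if e != last]
    elif path.endswith("/"):
        # The URL names a directory: try each ending in turn.
        paths = [path + e for e in COMMON_ENDINGS]
    else:
        # Some other file name; swapping endings does not apply.
        paths = []

    # Query string and fragment are dropped (an assumption of this sketch).
    return [urlunsplit((parts.scheme, parts.netloc, p, "", "")) for p in paths]
```

For example, `candidate_urls("http://example.com/foo/index.html")` starts with `http://example.com/foo/` and then lists the other ten endings under `/foo/`. Each candidate still has to be checked against the listing's title and description before the URL is updated.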

What do the Error Codes mean?

There are two types of error codes presented in the Open Directory. The first type consists of positive numbers greater than 100. These are HTTP status codes. If you see one of these numbers, Robozilla was able to talk to the server but could not retrieve the document for some reason.

Error codes with values less than 0 mean that Robozilla tried to talk to the server but did not succeed. This can be due to typos in the URL, a bad network connection, or an overloaded or down server. Links are checked twice before they are marked with an error, to try to account for machines that are temporarily down.
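The two families of codes, and the check-twice rule, can be illustrated with a short sketch. The names `classify`, `confirmed_error`, and the `check` callable are hypothetical; Robozilla's actual implementation is not described here.

```python
def classify(code):
    """Group an error code into the two families described above."""
    if code > 100:
        return "http"        # the server answered, but the document was not served
    if code < 0:
        return "connection"  # the server could not be reached successfully
    return "other"           # values the text above does not assign to either family

def confirmed_error(check, url, attempts=2):
    """Return an error code only if every attempt fails, else None.

    `check` is an assumed callable: it takes a URL and returns None on
    success or a numeric error code on failure.
    """
    code = None
    for _ in range(attempts):
        code = check(url)
        if code is None:
            return None      # one success is enough to clear the link
    return code
```

With `attempts=2`, a server that is down for the first check but up for the second is not flagged, which is the point of checking twice.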

Following are the most common error codes generated by Robozilla. A list of all error codes is also available, in case you come across one not listed, or you're just curious.

Code Meaning
500 Server Error: Sometimes happens due to a server misconfiguration. Usually this is transient, and goes away, but check first.
410 Gone: The resource doesn't exist at all. The page was removed by the webmaster.
404 Not Found: The resource doesn't exist on this server. The page may have been removed, or the site has restructured and the content now resides at a different URL.
403 Forbidden: You can't see this resource on the server. The admin may have turned off the pages because of load or for some other reason.
401 Unauthorized: You can't see this resource on the server. Perhaps a password is now required, or the resource you're looking for has moved.
400 Bad Request: Usually occurs due to a space in the URL or other malformed URL syntax. Try converting spaces to %20 and see if that fixes the error.
302 Redirect Temporarily: The page has a new URL temporarily (in theory; in practice, this is often used as a synonym for code 301). Update the listing to the new URL.
301 Redirect Permanently: The page has a new URL. Update the listing to the new URL.
0 Unknown Error: Probably a DNS error.
-1 Unable to Resolve Host: Probably a typo in the host name or they didn't pay their bill for their Domain name.
-4 Can't Connect: We can't connect to the HTTP server. The server is there but it didn't want to talk to Robozilla on the specified port.
-5 Timeout: Robozilla connected OK, and sent the request but Robozilla timed out waiting to fetch the page. This happens sometimes on really busy servers.
-6 Bad URL: There was a problem with the format of the URL. Perhaps http:// is missing? Note: when you click on the [edit] link for a URL missing http://, the edit page for the site comes up with http:// helpfully attached by the dmoz software, and the URL link therefore works. You still have to click Update (not "This URL works for me") in order for the update to take effect.
-7 Server Error: The server returned an unknown error code, and is probably mis-configured. The page may still show up okay, but it's a good idea to check it just in case.
-8 Domain Name Expired: The domain registration expired; the domain may be non-functional, or it may be parked (displaying a page of generic advertising links) or hijacked (claimed by a new owner, who is using the site for a different purpose, e.g. advertising or search engine optimization) – or it may be reclaimed by its previous owner. Take care to review the whois information for sites marked with this code (e.g., using domaintools.com), and make sure the content is still up to date as described in the listing before republishing.
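Two of the fixes in the table are mechanical: converting spaces to %20 (code 400) and adding a missing http:// (code -6). A minimal sketch of both follows; `tidy_url` is a hypothetical name, and the function handles only these two cases, not URL validation in general.

```python
def tidy_url(url):
    """Apply the two simple repairs suggested for codes 400 and -6."""
    url = url.strip()
    if "://" not in url:
        # Code -6: the scheme is missing, so assume plain HTTP.
        url = "http://" + url
    # Code 400: a literal space is not valid in a URL; encode it.
    return url.replace(" ", "%20")
```

For instance, `tidy_url("example.com/my page.html")` gives `http://example.com/my%20page.html`. A URL that already has a scheme and no spaces is returned unchanged.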