In the first part of our three-part series, we learned what bots are and why crawl budgets are important. Let's take a look at how to let search engines know what's important and some common coding issues.
How to let search engines know what's important
When a bot crawls your site, a number of cues direct it through your files.
Like humans, robots follow links to get an idea of the information on your site. But they also search your code and directories for specific files, tags, and items. Let's take a look at a number of these elements.
The first thing that a bot will look for on your site is your robots.txt file.
For complex sites, a robots.txt file is essential. For smaller sites with just a handful of pages, a robots.txt file may not be necessary; without it, search engine bots will simply crawl everything on your site.
There are two main ways to guide robots using your robots.txt file.
1. First, you can use the "disallow" directive. This instructs bots to ignore specific uniform resource locators (URLs), files, file extensions, or even entire sections of your site:
Disallow: /example/
Although the disallow directive stops bots from crawling particular parts of your site (and so saves crawl budget), it will not necessarily stop those pages from being indexed and showing up in search results:
The cryptic and unhelpful "No information is available for this page" message is not something you will want to see in your search listings.
The above example came about because of this disallow directive in census.gov/robots.txt:
Disallow: /cgi-bin/
2. Another method is to use the "noindex" directive. Noindexing a certain page or file will not stop it from being crawled. However, it will stop it from being indexed (or remove it from the index). This robots.txt directive is unofficially supported by Google and is not supported by Bing at all (so be sure to have a User-agent: * set of disallows for Bingbot and other bots other than Googlebot):
Noindex: /example/
Disallow: /example/
Obviously, since these pages are …
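To see how a crawler might read the disallow rules described above, here is a minimal sketch using Python's standard-library urllib.robotparser. The robots.txt content, the www.example.com host, and the is_crawlable helper are assumptions made up for illustration; only the /example/ and /cgi-bin/ paths come from the examples above. The parser answers the crawl question only and says nothing about whether a page ends up indexed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths mirror the article's examples.
ROBOTS_TXT = """\
User-agent: *
Disallow: /example/
Disallow: /cgi-bin/
"""

# Parse the rules from memory instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url: str, agent: str = "*") -> bool:
    """Return True if the given user agent may crawl the URL (hypothetical helper)."""
    return parser.can_fetch(agent, url)


# Paths under a disallowed prefix are blocked; everything else is allowed.
for url in ("/example/page.html", "/cgi-bin/script", "/about.html"):
    print(url, is_crawlable(url))
```

Because the rules sit under User-agent: *, they apply to any bot that has no more specific group of its own, which is why a catch-all group is a sensible fallback for Bingbot and other crawlers.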
[Read the full article on Search Engine Land.]
The opinions expressed in this article are those of the guest author and not necessarily Marketing Land. Staff authors are listed here.