Sometimes there are pages on your site that you simply do not want or need indexed. This could be because certain directories contain nothing useful for searchers, or simply because you don't want visitors landing on particular pages.
One good reason to exclude certain directories is to help out the search engines. Think about it – they have a lot of work to do to completely index your entire site, and all of that crawling generates traffic on the internet, on your ISP and on your host. Anything you can do to reduce this traffic helps the greater good.
You can tell search engines what to exclude by creating a robots.txt file. This is a simple text file in your site's root directory that contains a few keywords and the file specifications to be ignored. An example is shown below.
User-agent: *
Disallow: /images/
Disallow: /banners/
Disallow: /Forms/
Disallow: /Dictionary/
Disallow: /_borders/
Disallow: /_fpclass/
Disallow: /_overlay/
Disallow: /_private/
Disallow: /_themes/
The "User-agent"
keyword is virtually always set to "*"
, indicating this applies to all search engines. The "Disallow"
keywords are simply lists of directories and files to be excluded. Note that "/se"
will match any directory beginning with "/se"
while "/se/"
will only match a directory named "/se/"
. You may also include filenames after the directories, for example "/se/myfile.htm"
.
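If you want to double-check how a crawler will interpret your rules, Python's standard urllib.robotparser module can test them for you. The snippet below is a minimal sketch; the "/se" rule and the example.com URLs are hypothetical, chosen only to illustrate the prefix matching described above.

from urllib.robotparser import RobotFileParser

# Hypothetical rule set illustrating prefix matching with "/se"
parser = RobotFileParser()
parser.parse([
    "User-agent: *",
    "Disallow: /se",
])

# "/se" matches any path that begins with "/se" ...
print(parser.can_fetch("*", "https://www.example.com/search/results.htm"))  # False
print(parser.can_fetch("*", "https://www.example.com/se/myfile.htm"))       # False
# ... but leaves everything else alone
print(parser.can_fetch("*", "https://www.example.com/index.htm"))           # True

Changing the rule to "Disallow: /se/" would block only the "/se/" directory, so "/search/results.htm" would then be allowed.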
It is important to remember that the robots.txt file is available to everyone, so you never want to list the names of sensitive files or folders in it. If you must keep such content out of the search engines, it is better to put it behind password-protected pages, which search engines cannot reach at all (they don't have the password!)
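If your site runs on Apache, for example, one common way to do this is HTTP basic authentication via an .htaccess file in the folder you want to protect. The sketch below is illustrative only; the realm name and the password-file path are hypothetical assumptions, not values from the example above.

# .htaccess placed in the protected folder
AuthType Basic
AuthName "Private area"
AuthUserFile /home/example/.htpasswd
Require valid-user

The matching password file can be created with Apache's htpasswd utility (for example, htpasswd -c /home/example/.htpasswd someuser). Because the server demands credentials before serving anything in that folder, crawlers never see the content, and there is nothing sensitive to reveal in robots.txt.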