HeavyBoy Web and Photoshop Tips

 
 
 

  Web Tips - Robots.txt Tutorial

 
 

By placing a robots.txt file on your site, you can tell search engine spiders and bots which parts of your site to crawl and which to ignore. There are other types of bots and spiders, but most of us only care about the search engine type.

A few important things to note:

  1. A robots.txt file is only a request. Bad bots may disguise themselves as a browser and ignore your robots.txt rules entirely. If you want to forbid all possible access to a certain area, use a .htaccess file or similar.

  2. Your "robots.txt" file must be put in your root folder. In other words, the same folder that holds your homepage.
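To make the "root folder" rule concrete, here is a minimal Python sketch that works out where a robot would look for robots.txt, given any page on a site. The function name robots_txt_url and the example.com addresses are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the address where robots look for robots.txt for this page's site."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the file always sits at /robots.txt.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# A page deep inside the site still maps to the robots.txt at the root.
print(robots_txt_url("https://www.example.com/blog/2010/post.html"))
# https://www.example.com/robots.txt
```

A robots.txt saved in a subfolder such as /blog/ would simply never be requested.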

Example 1: Allow all robots to access all areas:

User-agent: *
Disallow:

#Copy the two lines above and save them as robots.txt.
#Upload the file to the root folder of your website.

Example 2: Deny all robots access to the entire site:

User-agent: *
Disallow: /

#Copy the two lines above and save them as robots.txt.
#Upload the file to the root folder of your website.

Example 3: Allow all robots to access your site except for the /cgi-bin folder:

User-agent: *
Disallow: /cgi-bin/

#Copy the two lines above and save them as robots.txt.
#Upload the file to the root folder of your website.
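If you want to check the three examples above before uploading anything, one option is Python's built-in urllib.robotparser module. This is only a rough sketch: the helper is_allowed, the bot name ExampleBot and the example.com addresses are invented for the test, and real search engines may read the rules slightly differently.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


ALLOW_ALL = "User-agent: *\nDisallow:\n"            # Example 1
DENY_ALL = "User-agent: *\nDisallow: /\n"           # Example 2
BLOCK_CGI = "User-agent: *\nDisallow: /cgi-bin/\n"  # Example 3

print(is_allowed(ALLOW_ALL, "http://www.example.com/any/page.html"))    # True
print(is_allowed(DENY_ALL, "http://www.example.com/any/page.html"))     # False
print(is_allowed(BLOCK_CGI, "http://www.example.com/cgi-bin/form.pl"))  # False
print(is_allowed(BLOCK_CGI, "http://www.example.com/index.html"))       # True
```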

You can also specify individual rules for individual bots by giving each bot its own User-agent line followed by its own Disallow lines. A list of bots can be found by entering "search engine bots" (without quotes) into Google.
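As a sketch of what per-bot rules might look like, the robots.txt text below gives Googlebot one set of rules and every other robot another, then checks the result with Python's built-in urllib.robotparser. The /drafts/ folder, the bot name SomeOtherBot and the example.com addresses are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: Googlebot may crawl everything except /drafts/,
# while every other robot is kept out of the whole site.
# A blank line separates one bot's rules from the next.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /drafts/

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("Googlebot", "http://www.example.com/index.html"))       # True
print(parser.can_fetch("Googlebot", "http://www.example.com/drafts/new.html"))  # False
print(parser.can_fetch("SomeOtherBot", "http://www.example.com/index.html"))    # False
```

A bot that finds a section addressed to it by name follows that section and skips the "*" section.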

Command Breakdown:

User-agent: * = This tells all incoming robots that the rules that follow apply to them.
User-agent: Googlebot = This addresses Google's bot specifically.
Disallow: = This allows everything, since no path is specified.
Disallow: / = This means do not spider the root level or anything deeper, i.e. the whole site.
Disallow: /images/ = This means do not spider anything inside the /images/ folder.
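One way to read a Disallow value is as the start of a path: anything whose path begins with that text is off limits. The Python sketch below shows that idea for a single Disallow value. It is a simplification written for this page (the function is_blocked is made up), and it leaves out extras such as wildcards that some search engines support.

```python
def is_blocked(path: str, disallow_value: str) -> bool:
    """Simplified check: is `path` covered by one Disallow value?"""
    if disallow_value == "":
        # "Disallow:" with nothing after it blocks nothing.
        return False
    # Otherwise the value is treated as a prefix of the path.
    return path.startswith(disallow_value)


print(is_blocked("/images/logo.png", ""))          # False: everything allowed
print(is_blocked("/images/logo.png", "/"))         # True: whole site blocked
print(is_blocked("/images/logo.png", "/images/"))  # True: inside the folder
print(is_blocked("/images.html", "/images/"))      # False: not inside the folder
```

The last line is why the trailing slash matters: "/images/" covers the folder only, while "/images" would also cover a page like /images.html.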

 
 

Designed by BounceFish Atlanta Web Design