Thread: Robots.txt
10-27-2008, 09:31 AM
Jdoughs
Confirmed User
 
Join Date: Mar 2004
Location: Great White North
Posts: 5,794
Quote:
Originally Posted by BigPimpCash View Post
I am currently reading up on it... I am a little confused though... I want bots/crawlers to read my site, so does it matter if I don't have a robots.txt file? It says that if I have an empty robots.txt file, they will all take it that it's OK to enter... but does not having one, or having a blank one, mean they are less likely to crawl the site?

I understand, however, that some bots are undesirable... where is there a list of the bots that are not wanted?

Someone said that some bots are bad, for example ones that harvest emails, and others that are site rippers... but it appears these are better blocked by rewriting your .htaccess file than by editing the robots.txt file... but again I would ask where you can find an up-to-date list of bad bots/site rippers to put into your .htaccess.
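
If you want them to spider everything, you don't need one; a missing or empty robots.txt is treated as permission to crawl everything. If you feel you need a file, add a general one allowing all bots. If you have a problem with a specific bot, ban just that bot. You can also use robots.txt to keep bots out of certain directories.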

To allow all robots complete access:

User-agent: *
Disallow:


To exclude all robots from the server:

User-agent: *
Disallow: /

To exclude all robots from parts of a server:

User-agent: *
Disallow: /private/
Disallow: /images-saved/
Disallow: /images-working/

To exclude a single robot from the server (replace Named Bot with that robot's user-agent name):

User-agent: Named Bot
Disallow: /
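
If you want to sanity-check rules like the ones above before uploading them, here is a rough sketch using Python's standard urllib.robotparser module. The bot names (AnyBot, BadBot, GoodBot), the helper name allowed and the test paths are made up for illustration, and real crawlers may interpret edge cases differently from this parser, so treat it as a quick check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The four rule sets from the post, as strings.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_DIRS = (
    "User-agent: *\n"
    "Disallow: /private/\n"
    "Disallow: /images-saved/\n"
    "Disallow: /images-working/\n"
)
# "BadBot" is a hypothetical user-agent name standing in for the bot to ban.
BLOCK_ONE = "User-agent: BadBot\nDisallow: /\n"

if __name__ == "__main__":
    # Hypothetical bot names and paths, used only to exercise the rules.
    print(allowed(ALLOW_ALL, "AnyBot", "/index.html"))
    print(allowed(BLOCK_ALL, "AnyBot", "/index.html"))
    print(allowed(BLOCK_DIRS, "AnyBot", "/private/a.html"))
    print(allowed(BLOCK_ONE, "BadBot", "/"))
    print(allowed(BLOCK_ONE, "GoodBot", "/"))
```

Keep in mind that robots.txt is only a request: well-behaved bots honor it, while email harvesters and site rippers usually ignore it.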