Quote:
Originally Posted by BigPimpCash
I am currently reading up on it... I am a little confuzzled though... I want bots/crawlers to read my site, so does it matter if I don't have a robots.txt file? It says that if I have an empty robots.txt file, they will all take it that it's OK to enter... but does not having one, or having a blank one, mean they are less likely to crawl the site?
I understand, however, that some bots are undesirable... where is there a list of the bots that are not wanted?
Someone said that some bots are bad bots that, for example, harvest emails, and others are site rippers... but it appears these are better blocked by rewriting your .htaccess file than by editing the robots.txt file... so again I would ask where you find an up-to-date list of bad bots/site rippers to put into your .htaccess.
If you want them to spider everything, don't add one. If you feel you need one, add a general one allowing all bots. If you have a problem with a specific bot, ban just that bot. You can also use robots.txt to keep bots out of certain directories.
To allow all robots complete access:
User-agent: *
Disallow:
To exclude all robots from the server:
User-agent: *
Disallow: /
To exclude all robots from parts of a server:
User-agent: *
Disallow: /private/
Disallow: /images-saved/
Disallow: /images-working/
To exclude a single robot from the server:
User-agent: Named Bot
Disallow: /
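If you want to double-check what a rule set actually allows before uploading it, here is a rough sketch using Python's standard urllib.robotparser module. The bot names "AnyBot", "GoodBot" and "BadBot" and the helper is_allowed are made up for illustration; swap in the real user-agent you care about. Keep in mind this only shows how a well-behaved parser reads the rules; it says nothing about bots that ignore robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The four rule sets from the post above, as plain strings.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_DIRS = (
    "User-agent: *\n"
    "Disallow: /private/\n"
    "Disallow: /images-saved/\n"
    "Disallow: /images-working/\n"
)
# "BadBot" is a hypothetical name standing in for the bot you want to ban.
BLOCK_ONE = "User-agent: BadBot\nDisallow: /\n"


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if user_agent may fetch path under the given rules."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    checks = [
        ("allow all", ALLOW_ALL, "AnyBot", "/private/page.html"),
        ("block all", BLOCK_ALL, "AnyBot", "/index.html"),
        ("block dirs", BLOCK_DIRS, "AnyBot", "/private/page.html"),
        ("block dirs", BLOCK_DIRS, "AnyBot", "/public/page.html"),
        ("block one", BLOCK_ONE, "BadBot", "/index.html"),
        ("block one", BLOCK_ONE, "GoodBot", "/index.html"),
    ]
    for label, rules, agent, path in checks:
        print(label, agent, path, is_allowed(rules, agent, path))
```

Running it prints one line per check, so you can see at a glance which bot/path combinations a rule set lets through.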