GoFuckYourself.com - Adult Webmaster Forum

minusonebit 01-01-2008 09:53 PM

Robots.txt Fun
 
Figured some of you might find this section of the Wikipedia robots.txt file interesting, and maybe consider blocking some of these bots on your own sites.

http://en.wikipedia.org/robots.txt

# Crawlers that are kind enough to obey, but which we'd rather not have
# unless they're feeding search engines.

User-agent: UbiCrawler
Disallow: /

User-agent: DOC
Disallow: /

User-agent: Zao
Disallow: /

# Some bots are known to be trouble, particularly those designed to copy
# entire sites. Please obey robots.txt.
User-agent: sitecheck.internetseer.com
Disallow: /

User-agent: Zealbot
Disallow: /

User-agent: MSIECrawler
Disallow: /

User-agent: SiteSnagger
Disallow: /

User-agent: WebStripper
Disallow: /

User-agent: WebCopier
Disallow: /

User-agent: Fetch
Disallow: /

User-agent: Offline Explorer
Disallow: /

User-agent: Teleport
Disallow: /

User-agent: TeleportPro
Disallow: /

User-agent: WebZIP
Disallow: /

User-agent: linko
Disallow: /

User-agent: HTTrack
Disallow: /

User-agent: Microsoft.URL.Control
Disallow: /

User-agent: Xenu
Disallow: /

User-agent: larbin
Disallow: /

User-agent: libwww
Disallow: /

User-agent: ZyBORG
Disallow: /

User-agent: Download Ninja
Disallow: /
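
For anyone who wants to sanity-check rules like these before dropping them into their own robots.txt, here is a minimal sketch using Python's standard-library `urllib.robotparser`. It only shows what a well-behaved parser would decide for a given user-agent; robots.txt is advisory, so nothing here stops a bot that ignores it. The `is_allowed` helper, the `SomeOtherBot` name, and the `example.com` URL are made up for illustration, and only two of the entries above are copied into the sample rules.

```python
from urllib.robotparser import RobotFileParser

# Two entries copied from the list above; extend with whatever you plan to block.
RULES = """\
User-agent: HTTrack
Disallow: /

User-agent: WebZIP
Disallow: /
"""


def is_allowed(user_agent, url, rules=RULES):
    """Return True if a compliant crawler with this user-agent may fetch url.

    Hypothetical helper: parses the rules text in memory instead of
    downloading a live robots.txt.
    """
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder URL, not a real site being tested.
print(is_allowed("HTTrack", "http://example.com/page.html"))       # False
print(is_allowed("SomeOtherBot", "http://example.com/page.html"))  # True
```

A bot with no matching entry is allowed by default, because these sample rules contain no `User-agent: *` block.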

FredIsMe 01-01-2008 09:56 PM

I'm surprised that these bots would actually listen.

minusonebit 01-01-2008 10:12 PM

Quote:

Originally Posted by FredIsMe (Post 13589288)
I'm surprised that these bots would actually listen.

Me too, but apparently they do; otherwise I doubt Wikipedia would put them in there...

