DotBot from moz.com is not obeying robots.txt directives
It was hammering thousands of my pages daily, so I had no choice but to ban it at the webserver level.
Besides, what good does it do? It just spies on your keywords and links. Even after the ban it keeps trying to access my sites (I left only robots.txt open to it; it reads that and ignores it anyway), so their crawler is obviously broken. What I'm trying to say is that you should review your access logs from time to time and see what is wasting your bandwidth and server resources. Another observation: within the last 3 years I have received between 500,000 and 1,000,000 hacking and exploit attempts per live website. None of them were successful, but they must still have some impact on server performance. A well-configured webserver can cut this bad traffic by 90-99%.
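For anyone who wants to do the same, here is a minimal sketch of a user-agent ban, assuming an Apache 2.4 server with mod_setenvif; the bad_bot variable name and the robots.txt exception are just illustrations, not something DotBot or Moz recommends.

Code:
# Tag any request whose User-Agent contains "dotbot" (case-insensitive)
SetEnvIfNoCase User-Agent "dotbot" bad_bot
# Still let the tagged bot fetch robots.txt, but nothing else
SetEnvIf Request_URI "^/robots\.txt$" !bad_bot

<Location "/">
    <RequireAll>
        Require all granted
        Require not env bad_bot
    </RequireAll>
</Location>

Blocking at this level returns a 403 before any of your application code runs, which is what actually saves the server resources.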
What directive(s) was the bot ignoring?
Code:
User-agent: dotbot
Disallow: /

Disallow everything for it like that (Google stays allowed by default), or contact the DotBot guys.
Have you blocked it by IP? If so, has it come back under other IPs?
Nah, I've blocked it by the user-agent header, so I have no problem with it now.
This thread was meant more as an educational one and a suggestion for everyone to monitor their webserver logs, at least from time to time. I'm fine. :) Fewer resources eaten by bad traffic = more resources for good traffic = more speed, and speed = better SEO.
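If you want an easy way to keep an eye on that traffic, a rough sketch (again assuming Apache and reusing the hypothetical bad_bot variable from the earlier snippet) is to send tagged requests to their own log file:

Code:
# Requests tagged as bad_bot go to a separate log for periodic review
CustomLog logs/bad_bots.log combined env=bad_bot
# Everything else stays in the normal access log
CustomLog logs/access.log combined env=!bad_bot

Skimming the separate file once in a while makes it obvious which crawlers are wasting your bandwidth.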
Even "good" bots can cause issues.
At one point Googlebot was fetching 150k+ pages per day from one of my sites. The site is heavily database-driven, so fetching two pages per second continuously did cause some server load issues. You can dial back the crawl rate in Webmaster Tools, but that setting expires after 90 days, and then Googlebot just starts pounding away again. They deliberately ignore the Crawl-Delay robots.txt directive.
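For crawlers that do respect it (Bing, for example; Googlebot does not), the directive is a one-liner in robots.txt. This is only a sketch, and the 10-second value is an arbitrary example:

Code:
User-agent: *
# Ask compliant crawlers to wait 10 seconds between requests
Crawl-delay: 10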