Googlebot... how arrogant can they be?
Google ignores crawl-delay in robots.txt. Instead, they force you to register your site in Webmaster Tools so you can set a custom crawl rate. This is then RESET to the default after 90 days, so at that point you have to log in and change it again! How fucking arrogant is that? :321GFY
Other search engines respect crawl-delay. Imagine if they all wanted us to create accounts and log in every 90 days to stop their bots hammering our servers! :321GFY |
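For reference, here is roughly how a crawler that honors crawl-delay could read the value. This is only a minimal Python sketch using the standard library's `urllib.robotparser`; the robots.txt content and the bot name `ExampleBot` are made up for illustration, and it says nothing about how any real search engine implements this.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""


def read_crawl_delay(robots_txt, user_agent):
    """Return the Crawl-delay (in seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network fetch is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


delay = read_crawl_delay(ROBOTS_TXT, "ExampleBot")
print("Seconds to wait between requests:", delay)
```

A polite bot would sleep for that many seconds between fetches. Whether a given bot actually does so is up to the bot's operator.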
Yeah, that damn Googlebot racking up bandwidth.
|
And the point isn't so much what they're doing specifically to my site, more that they're arrogant enough to ignore (de facto?) robots.txt settings that every other major search engine bot respects. The webmaster dashboard robots.txt checker even helpfully points out that each of the crawl-delay lines in my robots.txt is ignored!
|
So Googlebot is ignoring crawl-delay? I have delays set because I don't want to see Google every day on a site that I only update weekly. If crawl-delay doesn't work, that's really sick, because six days out of the week Google sees my site as static and not updated. Oh snap.
|
Quote:
http://www.google.com/support/webmas...n&answer=48620 My site has 200 million pages so technically googlebot isn't fetching fast enough... at the rate of 120k fetches per day it would take 4 1/2 years to index everything. At this point the benefit of indexing 100% of the site (or at least as much as it's trying to) isn't worth the load it's placing on the server. |
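The 4 1/2 years figure checks out. A quick Python sketch of the arithmetic, using only the numbers from the post and assuming a 365-day year:

```python
def years_to_crawl(total_pages, fetches_per_day):
    """Years needed to fetch every page once at a constant daily rate."""
    days = total_pages / fetches_per_day
    return days / 365  # assumption: ignore leap years


years = years_to_crawl(200_000_000, 120_000)
print(f"{years:.2f} years")  # a little over four and a half years
```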
User-agent: Googlebot
Disallow: /
problem solved :) |
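To see what that rule does in practice, here is a small Python sketch that feeds it to the standard library's `urllib.robotparser`. The URL and the second bot name (`ExampleBot`) are placeholders, not anything from this thread. Note that the rule shuts Googlebot out entirely instead of slowing it down.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule from the post above.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com and ExampleBot are placeholders for illustration.
print(is_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/page.html"))
print(is_allowed(ROBOTS_TXT, "ExampleBot", "http://example.com/page.html"))
```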
Quote:
To the OP - agreed. The trouble is, they can do what the hell they want with over 70% of the search market. |
What kind of site has 200,000,000 pages?
That's a lot of pages |
They can be very arrogant.
|
Annoying, isn't it? I've had Googlebot hit my servers up to 15 times per second, for hours at a time. Dynamic pages really make Googlebot go nuts.
|
They're evil, plain and simple....
|
woj's solution is good if you want to keep Googlebot off your site completely. But are you sure you want to do that?
|
Quote:
Rowan actually founded Google and he's frustrated because his baby is stuck in a loop. Even as the founder, he cannot get support at Google and has to do the same thing as the rest of us. |
Sue them.
|
Quote:
Yeah, I got pushed out by The Man! Fuck them! :1orglaugh |
Quote:
|
What kind of site has 200,000,000 pages?
|
Quote:
pieces of fake SE spam crap? If you tried to spam the crap out of Google by creating 200 million bogus pages, I'd say you got what you deserved, and really what you asked for. If you pretended to have 200 million pages so that Google would spider you 200 million times, that was your decision. You can't blame Google if you chose to create fake stuff for them to spider. Note the repeated use of "IF" - I'm asking IF that's what you did. |
Quote:
Yeah, I was wondering when someone would post this. Bottom line: if it's causing you more problems than it's worth, just block it. If they are hitting you that hard, you should be getting some good traffic because of it. More traffic = more money, so just upgrade the servers. |
I return a 503 page and they seem to respect that. Then I set it so they can crawl during my off-peak hours, which they seem to do.
It should work, and I know this is documented somewhere in Google's FAQ. I just can't seem to find the link right now. |
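A rough idea of what the 503 approach could look like, as a minimal Python sketch with the standard library's `http.server`. The peak-hour window, the `Retry-After` value, and matching on the string "googlebot" in the User-Agent header are all assumptions made for illustration; the post does not say how its setup works, and nothing here guarantees how a crawler will react.

```python
from datetime import datetime
from http.server import BaseHTTPRequestHandler

PEAK_HOURS = range(8, 22)   # assumption: busy from 08:00 to 21:59 server time
RETRY_AFTER_SECONDS = 3600  # assumption: ask the bot to come back in an hour


def should_throttle(user_agent, hour, peak_hours=PEAK_HOURS):
    """True when the request looks like Googlebot and arrives during peak hours."""
    return "googlebot" in user_agent.lower() and hour in peak_hours


class ThrottlingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        user_agent = self.headers.get("User-Agent", "")
        if should_throttle(user_agent, datetime.now().hour):
            # 503 Service Unavailable, with a hint about when to retry.
            self.send_response(503)
            self.send_header("Retry-After", str(RETRY_AFTER_SECONDS))
            self.end_headers()
            return
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, something like `HTTPServer(("127.0.0.1", 8000), ThrottlingHandler).serve_forever()` (with `HTTPServer` imported from `http.server`) would start it. In a real setup the same check would more likely live in the web server or load balancer config.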
Quote:
raymor: not useless spam, it's all genuine profiling of... domains. :error |
February 22, 2010
New crawl rate: Custom rate
1.000 requests per second
1.000 seconds per request
This new crawl rate will stay in effect for 90 days.

Funny, 2 weeks later Googlebot is still requesting 120k+ pages per day, which is 140% or more of the rate allowed by the above setting. Their Webmaster Tools system also sent me a notification encouraging me to increase the rate so they can fetch more pages. Looks like their bot is doing it anyway. :321GFY |
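A quick check of those numbers, as a small Python sketch. It assumes the 1 request per second setting is meant as a steady cap over a full 24 hours, which is an interpretation and not something the Webmaster Tools text states.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_cap(requests_per_second):
    """Requests per day if the configured rate were sustained all day."""
    return requests_per_second * SECONDS_PER_DAY


cap = daily_cap(1.0)       # 86,400 requests per day at 1 request per second
observed = 120_000         # the "120k+" from the post, taken as a lower bound
percent_of_cap = observed / cap * 100
print(f"cap: {cap:,.0f}/day, observed: {percent_of_cap:.0f}% of cap")
```

At exactly 120,000 fetches per day that comes to about 139% of the configured rate, and since the post says "120k+", the real figure is somewhat higher.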
bad bad google :)
|