Quote:
|
Originally Posted by Big Ray
Hey WG.
You make a good point. However, there is a big difference between the Google and Yahoo spiders and the ones that try to pull 200 threads a second, all day, every day. What we have to contend with are aggressive screen scrapers that feel the need to index every post the second it's made. They are highly aggressive and are not throttled. They behave more like a DoS attack than a bot.
People ask why we don't ban them. The answer is we try. However, the majority of them use anonymous proxies, and I don't mean a few. I am talking hundreds. When we block those, they get a new batch. It's a constant battle, which we fight every day.
The search feature is widely used in attempts to build search notification systems. Some companies do this and hit the site once a day to see what people are saying about them or about a given keyword. Others are constantly hitting search for keywords like "need content" or "need hosting". Ever wonder how the same folks always post first in a thread requesting a product or service? Now ya know. lol
Ray
|
You're right, an SE must take the target server's load into consideration and use common sense and "SE spidering ethics". In this case, you should either seek to stop this from happening or work with them to improve their efforts. That would save both sides a lot of time. After all, they are trying to send traffic to GFY (in the case of STBoards) and have no problem having people read the full thread here.
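
To make the "spidering ethics" point concrete, here is a rough sketch of what throttling on the spider's side could look like: a minimum delay between requests to the same host. Nothing here comes from any actual spider. The class name `PoliteThrottle`, the `wait` method, and the 5-second default are all made up for illustration, and a real crawler would also want to honor robots.txt and back off when the server is slow.

```python
import time
from urllib.parse import urlparse


class PoliteThrottle:
    """Enforce a minimum delay between requests to the same host.

    Hypothetical helper for illustration; the 5-second default is an
    arbitrary assumption, not a recommended value.
    """

    def __init__(self, min_delay=5.0, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay
        self._clock = clock    # injectable so the logic can be tested without waiting
        self._sleep = sleep
        self._last_hit = {}    # host -> time of the most recent request

    def wait(self, url):
        """Block until it is polite to request `url`; return seconds waited."""
        host = urlparse(url).netloc
        now = self._clock()
        last = self._last_hit.get(host)
        waited = 0.0
        if last is not None:
            remaining = self.min_delay - (now - last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        # Record the time after any sleep, just before the request goes out.
        self._last_hit[host] = self._clock()
        return waited
```

A spider would call `throttle.wait(url)` right before each fetch. Requests to the same host get spaced out by at least `min_delay` seconds, while requests to different hosts are not held up by each other.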