Welcome to the GoFuckYourself.com - Adult Webmaster Forum forums.

Discuss what's fucking going on, and which programs are best and worst. One-time "program" announcements from "established" webmasters are allowed.

 
Old 01-05-2013, 04:45 AM   #1
Captcha
Confirmed User
 
 
Join Date: Aug 2009
Posts: 1,375
Bad Robots, site rippers

I am sick of bad robots, site rippers, etc.
Lots and lots of "noref" hits in ATX! I just launched a new site and I'm getting fucked with this shit again. Does anyone have an up-to-date list like this one to block this shit?

http://www.javascriptkit.com/howto/htaccess13.shtml
Old 01-05-2013, 04:55 AM   #2
JamesM
Confirmed User
 
Join Date: Nov 2012
Posts: 732
Bots are smart these days; they use fake user agents.
Anyway, why are you worried? Is it because of bandwidth? Just curious.

Popular sites seem to get this.
Old 01-05-2013, 07:07 AM   #3
Barry-xlovecam
It's 42
 
Join Date: Jun 2010
Location: Global
Posts: 18,083
Longstanding problem

Here are a few links I found that looked interesting:
http://www.wizcrafts.net/exploited-s...blocklist.html

http://antiscraper.com/

From my experience it is "whack-a-mole": you need the time or manpower to grep the webserver logs looking for page requests (or HEAD requests) with no requests for images. That is a dead giveaway. Then you have to be careful not to disallow legitimate search engine bots.
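
The "pages but no images" check can be roughed out in shell. This is only a sketch: it assumes the access log is in the common Apache/nginx "combined" format (client IP in field 1, request path in field 7), and the helper name `no_image_ips`, the image extension list, and the default threshold of 5 pages are all made up for illustration.

```shell
#!/bin/sh
# Sketch: list client IPs that fetched pages but never fetched an image.
# Assumes "combined" log format: field 1 = client IP, field 7 = request path.
# The extension list and the default threshold are guesses; tune them per site.

# Usage: no_image_ips LOGFILE [MIN_PAGES]   (hypothetical helper name)
no_image_ips() {
    awk -v min="${2:-5}" '
        {
            path = tolower($7)
            sub(/\?.*$/, "", path)              # drop any query string
            if (path ~ /\.(jpe?g|png|gif|webp|ico)$/)
                images[$1]++                    # image request
            else
                pages[$1]++                     # anything else is counted as a page
        }
        END {
            # print "count ip" for IPs with enough page hits and zero image hits
            for (ip in pages)
                if (pages[ip] >= min && !(ip in images))
                    print pages[ip], ip
        }
    ' "$1" | sort -rn
}
```

Running `no_image_ips access.log 20` would print the IPs with at least 20 non-image requests and no image requests, busiest first. CSS and JS hits are counted as pages here, and legitimate search engine bots will show up in this list too, so it is a list of candidates to look at, not a ban list.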

You can look for unusual activity ...

using SSH with sufficient permissions (as a normal user or as root, depending on the log's location):

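Quote:
# request counts per IP over the last 5000 log lines; the 10 busiest IPs come out last
tail -5000 access.log | awk '{print $1}' | sort | uniq -c | sort -n | tail >tofilename
# same counts sorted high to low (numeric); tail keeps the 10 least active IPs
tail -5000 access.log | awk '{print $1}' | sort | uniq -c | sort -rn | tail >tofilename
# GET request counts per IP over the whole log, busiest first
grep 'GET' access.log | cut -d' ' -f1 | sort | uniq -c | sort -rn >tofilename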
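
Once the counts above point at a busy IP, it may help to see which user agents that IP claimed before blocking anything, since legitimate search engine bots also make a lot of requests. A rough sketch, again assuming the "combined" log format (where the user agent is the sixth field when a line is split on double quotes); `ua_for_ip` is a hypothetical helper name, not a standard tool.

```shell
#!/bin/sh
# Sketch: count the user agents a single client IP sent.
# Assumes "combined" log format, so splitting a line on double quotes gives:
#   $1 = 'IP - - [date] ', $2 = request, $4 = referer, $6 = user agent.

# Usage: ua_for_ip LOGFILE IP   (hypothetical helper name)
ua_for_ip() {
    awk -F'"' -v ip="$2" '
        {
            split($1, head, " ")        # head[1] is the client IP
            if (head[1] == ip)
                print $6                # user agent string
        }
    ' "$1" | sort | uniq -c | sort -rn
}
```

For example, `ua_for_ip access.log 203.0.113.7` (a placeholder address) lists each user agent with its hit count, most frequent first. The user agent string can be faked, so a line that says Googlebot is only a hint that the IP deserves a closer look, not proof either way.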
Welcome to the dark side ...


©2000-, AI Media Network Inc


