Old 01-05-2013, 07:07 AM  
Barry-xlovecam
Longstanding problem

Here are a few links I found that looked interesting:
http://www.wizcrafts.net/exploited-s...blocklist.html

http://antiscraper.com/

In my experience it is "whack-a-mole": you need the time or manpower to grep the web server logs looking for page requests (or HEAD requests) with no accompanying requests for images. That is a dead giveaway. Then you have to be careful not to block legitimate search engine bots.
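A rough sketch of that "pages but no images" check, in case it helps. It assumes the log is in the usual Apache/nginx "combined" format (client IP in field 1, request path in field 7); the function name imageless_ips and the list of image extensions are my own assumptions, so adjust them to your setup.

```shell
# Sketch: list client IPs that made requests but never fetched an image.
# Assumes "combined" log format: field 1 = client IP, field 7 = request path.
# imageless_ips and the extension list are illustrative, not a standard tool.
imageless_ips() {
    awk '
        {
            path = tolower($7)
            sub(/\?.*$/, "", path)            # drop any query string
            if (path ~ /\.(gif|jpe?g|png|ico|webp|svg)$/)
                images[$1]++                  # this IP fetched an image
            else
                pages[$1]++                   # any other request (GET or HEAD)
        }
        END {
            # print "count ip" for IPs that never requested an image
            for (ip in pages)
                if (!(ip in images))
                    print pages[ip], ip
        }
    ' "$1" | sort -rn
}
```

Run it as `imageless_ips access.log >tofilename`. The busiest image-free IPs come first. Treat the output as a list of suspects only: search engine crawlers and text-mode clients will show up here too, so check each one before blocking it.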

You can look for unusual activity over SSH, logged in as a user with read permission on the logs (a regular user or root, depending on where the log lives):

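Quote:
# 10 busiest IPs in the last 5000 requests (ascending by hit count)
tail -5000 access.log | awk '{print $1}' | sort | uniq -c | sort -n | tail >tofilename
# 10 least active IPs in the same window; swap tail for head to get the busiest first
tail -5000 access.log | awk '{print $1}' | sort | uniq -c | sort -rn | tail >tofilename
# GET requests per IP across the whole log, busiest first
grep '"GET ' access.log | cut -d' ' -f1 | sort | uniq -c | sort -rn >tofilename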
Welcome to the dark side ...
