01-05-2013, 07:07 AM
It's 42
Join Date: Jun 2010
Location: Global
Posts: 18,083
Longstanding problem
Here are a few links I found that looked interesting:
http://www.wizcrafts.net/exploited-s...blocklist.html
http://antiscraper.com/
From my experience it is "whack-a-mole": you need the time or manpower to grep the webserver logs looking for page requests (or HEAD requests) with no accompanying requests for images. That is a dead giveaway. Then you have to be careful not to disallow legitimate search engine bots.
You can look for unusual activity over SSH, with the right permissions (as a regular user or as root, depending on the log's location):
Quote:
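# 10 busiest IPs in the last 5000 requests (highest count last)
tail -5000 access.log | awk '{print $1}' | sort | uniq -c | sort -n | tail >tofilename
# 10 least active IPs in the last 5000 requests
tail -5000 access.log | awk '{print $1}' | sort | uniq -c | sort -rn | tail >tofilename
# GET requests per IP over the whole log, busiest first
grep 'GET' access.log | cut -d' ' -f1 | sort | uniq -c | sort -rn >tofilename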
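The "pages but no images" check mentioned above could be roughed out along these lines. This is only a sketch: the function name pages_without_images is made up for illustration, and it assumes the common/combined log format (IP in field 1, method in field 6, request path in field 7) plus a guessed list of image and asset extensions, so adjust it to your own logs.

```shell
#!/bin/sh
# Hypothetical helper (not from any tool): list client IPs that requested
# pages but never requested an image, with their page-request counts.
# Assumes common/combined log format: $1 = client IP, $6 = "METHOD, $7 = path.
pages_without_images() {
    awk '
        # Only look at GET and HEAD requests; $6 still carries the opening quote.
        $6 != "\"GET" && $6 != "\"HEAD" { next }
        {
            path = tolower($7)
            sub(/\?.*/, "", path)    # drop any query string
            if (path ~ /\.(gif|jpe?g|png|ico|webp|svg)$/)
                images[$1]++         # this IP fetched at least one image
            else if (path !~ /\.(css|js)$/)
                pages[$1]++          # anything else counts as a page request
        }
        END {
            # Report IPs that have page requests and no image requests at all.
            for (ip in pages)
                if (!(ip in images))
                    print pages[ip], ip
        }
    ' "$1" | sort -rn
}
```

Call it as pages_without_images access.log >tofilename. Search engine crawlers will tend to show up in the output too, so check each IP against the legitimate bots before blocking anything.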
Welcome to the dark side ...