Old 01-24-2010, 07:59 PM  
raymor
For Throttlebox we graphed the derivative of IP versus hits.
The knee in the graph clearly shows normal versus abusive behaviour.
Alternatively, and slightly less accurately, you can graph the top X IPs, where
X is high enough to show the knee. Here's an example of that from
Throttlebox:

https://bettercgi.com/throttlebox/ma...oosing_limits/

You see in the graph that the second highest, third highest etc. are roughly
linear with a near horizontal slope. That's indicative of normal usage. In the
case of the graph illustrated, only the #1 top user is far from being linear with
the others - that's the abusive one.

For another example, let's say the graph looked like this:

Code:
20 #
19 #
18 ##
17 ###
16 ###
15 ###
14 ####
13 ##### 
12 #####
11 #####
10 #####
9  ######
8  ######
7  ########
6  #########
5  ############
4  ##############
3  #####################################################
2  #################################################################
1  ####################################################################
In this ASCII graph, the top three are way out of line from the others, which is
indicative of abuse.
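
If you'd rather find that knee in code than by eye, here is a rough Python sketch. To be clear, this is not how Throttlebox does it. It just looks for the largest relative drop between neighbouring entries in the sorted top-X list. The function name, the min_ratio default, and the sample numbers are all made up for illustration; the numbers are only meant to resemble the shape of the ASCII graph above.

```python
def knee_index(counts, min_ratio=2.0):
    """Return how many leading entries sit above the biggest relative drop.

    counts: hits per IP, sorted highest first (the "top X" list).
    min_ratio: assumed threshold; a drop must be larger than this ratio
               to count as a knee. Returns 0 when no drop qualifies.
    """
    best_ratio = min_ratio
    best_i = 0
    for i in range(len(counts) - 1):
        # Ratio between this entry and the next one down the ranking.
        ratio = counts[i] / max(counts[i + 1], 1)
        if ratio > best_ratio:
            best_ratio = ratio
            best_i = i + 1
    return best_i


# Made-up hit counts for the top 20 IPs, highest first.
sample = [68, 65, 53, 14, 12, 9, 8, 6, 6, 5, 5, 5, 5, 4, 3, 3, 3, 2, 1, 1]
suspects = knee_index(sample)
print("IPs above the knee:", suspects)
```

A ratio is used rather than a plain difference so that two heavy abusers with quite different totals still end up on the same side of the knee. Treat the result as a starting point for choosing a limit, not as a verdict.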

That will tell you where the cut off line should be, but that's the easy part.
There are much more difficult issues to work out before you have something
truly effective.

You have to be careful, since you're working with IPv4 addresses.
On a site with a broad user base, you should expect that AOL's proxies and
DTAG's proxies, for example, are going to have a LOT more hits than any normal IP.
If the site has 12 AOL users on at different times of the same day, six of those
users may show up as the same IP.
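
One way to allow for that, sketched in Python below, is to give known shared proxies a higher limit than ordinary IPs. This is only an assumption about how you might handle it, not a description of Throttlebox. The network listed is a reserved documentation range used as a placeholder, not a real AOL or DTAG range, and the limit and multiplier values are arbitrary.

```python
import ipaddress

# Placeholder list: 192.0.2.0/24 is a reserved documentation range.
# You would fill this in with the proxy ranges you actually see.
SHARED_PROXY_NETS = [ipaddress.ip_network("192.0.2.0/24")]


def limit_for(ip, base_limit=100, proxy_multiplier=10):
    """Return the hit limit for an IP (hypothetical helper).

    IPs inside a known shared-proxy range get base_limit * proxy_multiplier,
    since many real users may sit behind one address.
    """
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in SHARED_PROXY_NETS):
        return base_limit * proxy_multiplier
    return base_limit


print(limit_for("192.0.2.44"))    # inside the placeholder proxy range
print(limit_for("198.51.100.1"))  # ordinary address
```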

On the other side, a cracker going through a zombie web server may use all sixteen
IPs on that server, so you really want to look at ranges of IPs as well.
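
A minimal Python sketch of counting hits per range instead of per IP follows. The /24 default is my assumption; sixteen addresses would fit in a /28, but a server's IPs are not guaranteed to line up with any one block, so you may want to try a couple of prefix lengths. The function name and the sample addresses (reserved documentation ranges) are made up for illustration.

```python
import ipaddress
from collections import Counter


def hits_by_range(ips, prefix_len=24):
    """Count hits per network, given one IP string per hit."""
    totals = Counter()
    for ip in ips:
        # strict=False masks off the host bits, e.g. 203.0.113.5/24
        # becomes the network 203.0.113.0/24.
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        totals[str(net)] += 1
    return totals


hits = ["203.0.113.5", "203.0.113.6", "203.0.113.20", "198.51.100.7"]
for net, count in hits_by_range(hits).most_common():
    print(net, count)
```

You can feed the per-range totals into the same kind of top-X graph as the per-IP totals and look for a knee there too.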

I know I'm throwing a lot out there at you, but only because there are a lot of things
to consider.
We've been working on "detect and stop abuse" for a decade and a half and still need
to do updates all the time in order to remain optimally effective.