Old 01-01-2005, 12:44 PM  
raymor
Confirmed User
 
Join Date: Oct 2002
Posts: 3,745
If you decide to block site rippers, there's a bad way to do it,
an OK way to do it, and a good way to do it.
The bad way is to have a .htaccess listing hundreds
of User-Agent strings that may be used by site rippers.
This is bad for two main reasons. #1, most site rippers
let the user change the User-Agent string. Many have
a simple checkbox to make it send the same user agent
as IE6. Though IE6 [B]IS[/B] 3 years old already, a lot of
people still use it of course, so you can't block it, which means
you can't block the site rippers that can so easily spoof
this user agent, which is most of them. Also your list will
never be complete, so many site rippers would be unaffected.
#2, each of those user-agent lines is a condition that
Apache has to evaluate for every single hit.
If the user loads a page with 30 thumbnails, Apache has to
go looking through that list for each and every thumbnail,
doing thousands of comparisons just to load one page.
Performance WILL suffer noticeably if you have any reasonable
amount of traffic whatsoever.
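
To see why, here's what a blacklist like that looks like in
.htaccess - a minimal sketch with just three entries (the ripper
names shown are real examples, but a real blacklist would run to
hundreds of these lines, each one a regex match on every hit):

```apache
# Blacklist approach (NOT recommended) - one condition per known ripper
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} "Wget"     [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "HTTrack"  [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Teleport" [NC]
# ...hundreds more RewriteCond lines in a real blacklist...
RewriteRule .* - [F]
```

Every request - every thumbnail, every image - gets run through
the whole chain until something matches or the list runs out.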

A slightly better way is to apply one of the cardinal rules of
security - disallow everything, then allow only what is OK.
Rather than listing hundreds of user agents (browsers)
that aren't allowed, you just list the 5 or 6 that ARE allowed -
IE, Mozilla and its variations (including Firefox and Netscape),
Windows Media Player, RealPlayer, Opera, and Safari.
That takes care of the problem of keeping the list up to date
(until you get members using some other browser)
and solves the performance issues. It still leaves you wide open
to spoofing, though, and because it looks not at the problem,
site ripping, but only at the user agent, it's not 100% effective.
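
The whitelist version might look something like this in .htaccess.
The conditions are ANDed, so a request is only forbidden if the
user agent matches NONE of the allowed tokens. The exact tokens
here are illustrative guesses - check the actual User-Agent
strings your members' browsers and players send before using
anything like this:

```apache
# Whitelist approach - forbid anything that isn't a known-good browser
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} !Mozilla   [NC]
RewriteCond %{HTTP_USER_AGENT} !Opera     [NC]
RewriteCond %{HTTP_USER_AGENT} !NSPlayer  [NC]
RewriteCond %{HTTP_USER_AGENT} !RealMedia [NC]
RewriteRule .* - [F]
```

Note "Mozilla" alone covers IE, Firefox, Netscape, and Safari,
since all of them send a User-Agent starting with "Mozilla/" -
which is also exactly why this is so easy to spoof.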
For example, IE has site ripping built right in! Add your site
as a favorite, then select offline viewing, and you can use IE
itself as a site ripper. How are you going to block that?
Well, you're going to block it by detecting and acting upon
the act of site ripping itself, rather than on the name of the
software. That's the right way to do it, which brings us to
option #3:

The best way, probably, is by using Strongbox. Besides having by
far the most sophisticated protection against brute force attacks
and password trading, Strongbox also defends against site rippers
by actually detecting and stopping the ripping process -
the following of every link. Additionally, it provides an
enhanced version of method #2, where a user can only
access a page or image on the site by using the same
browser he or she logged in with in the first place.
These two defenses combined are much, much more
effective than naive attempts based on listing the user-agent
headers sent by known rippers.
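
To make the idea concrete - and this is just a naive sketch of
the general technique, NOT Strongbox's actual implementation -
behavioral detection boils down to watching how fast one client
walks through distinct pages. The class names and thresholds
below are made up for illustration:

```python
import time
from collections import defaultdict, deque


class RipperDetector:
    """Flag clients that fetch too many distinct pages in a short window.

    A human clicking around hits a handful of pages per minute; a
    ripper following every link hits dozens or hundreds.
    """

    def __init__(self, max_pages=30, window_seconds=60):
        self.max_pages = max_pages
        self.window = window_seconds
        self.hits = defaultdict(deque)  # client id -> deque of (time, path)

    def record(self, client, path, now=None):
        """Record one page hit; return True if the client looks like a ripper."""
        now = time.time() if now is None else now
        q = self.hits[client]
        q.append((now, path))
        # Drop hits that have fallen out of the sliding window.
        while q and now - q[0][0] > self.window:
            q.popleft()
        distinct = len({p for _, p in q})
        return distinct > self.max_pages
```

A real product would key on the login session rather than the IP,
tune the thresholds, and lock the account instead of just
returning a flag - but the core "detect the act of following
every link" idea is this simple.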

For more info on Strongbox, see:
http://www.bettercgi.com/strongbox/
__________________
For historical display only. This information is not current:
support@bettercgi.com ICQ 7208627
Strongbox - The next generation in site security
Throttlebox - The next generation in bandwidth control
Clonebox - Backup and disaster recovery on steroids