There's a lot of discussion of this over on Webmaster World News. Do a search for "perfect .htaccess". Some folks over there have coded up routines that monitor bytes downloaded over time and drop the visitor if it's acting like a ripper instead of a surfer. That covers the problem of a ripper whose user agent isn't in your .htaccess, or one that spoofs its agent. Actually, they've got quite a few neat little ideas over there ...
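
For anyone wondering what the byte-monitoring idea looks like in practice, here's a rough sketch in Python. To be clear, this is not the code from the Webmaster World threads, just my guess at the general shape: keep a sliding window of bytes served per IP and block the IP once it goes over a budget. The class name, the 60-second window and the 5 MB budget are all made-up assumptions you'd tune for your own site.

```python
import time
from collections import defaultdict, deque

# Both values are assumptions for illustration; tune them for your traffic.
WINDOW_SECONDS = 60        # how far back we look
MAX_BYTES = 5_000_000      # bytes one IP may pull within the window


class RipperDetector:
    """Hypothetical helper: tracks bytes served per IP over a sliding window."""

    def __init__(self, window=WINDOW_SECONDS, max_bytes=MAX_BYTES):
        self.window = window
        self.max_bytes = max_bytes
        self.history = defaultdict(deque)   # ip -> deque of (timestamp, nbytes)
        self.totals = defaultdict(int)      # ip -> bytes inside the window
        self.blocked = set()

    def record(self, ip, nbytes, now=None):
        """Log one response; return True if the IP is (now) blocked."""
        if ip in self.blocked:
            return True
        now = time.time() if now is None else now

        events = self.history[ip]
        events.append((now, nbytes))
        self.totals[ip] += nbytes

        # Forget anything that has fallen out of the window.
        while events and events[0][0] <= now - self.window:
            _, old_bytes = events.popleft()
            self.totals[ip] -= old_bytes

        # Over budget inside the window: treat it as a ripper.
        if self.totals[ip] > self.max_bytes:
            self.blocked.add(ip)
            return True
        return False
```

You'd call `record(ip, bytes_sent)` after each response and refuse further requests once it returns True. Since it keys on the IP and the download rate rather than the user agent string, a spoofed or unlisted agent doesn't get around it, though a normal visitor on a fast connection could trip it if the budget is set too low.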
