01-21-2005, 12:14 AM
dcortez
DINO CORTEZ™
 
Join Date: Jun 2003
Location: Vancouver Island
Posts: 2,145
Quote:
Originally Posted by FightThisPatent
nope. and trying to block our web crawlers isn't quite the answer for many reasons:
Well, there you go. By design, this is a tool intended for covert information gathering.

Google gets to 'look up my skirt' because it gives me something back (not as much since last December, but we're working on that).

Altavista gets a peek, as does MSN and many other spiders which merit access to my sites.

Regardless of what your spidering is about, I will not be the only one blocking your access. Unless I order a report, there is absolutely no value in permitting and/or aiding you in the assembly of your 'interpretation' of my outbound links.

If anything, aiding you in your compilation of information about where I send my traffic will do much more potential harm than any possible good where my (site) interests are concerned.

Suppose site owners realize that permitting an agent such as yours means their outbound link info is collected and published to sponsors, who could (at their discretion) use that information to hurt their affiliate efforts. If they then ban your 'army of snoopbots' as an act of prudence, your reports will lose relevance and meaning at a binary rate of progression.

I ban most site-scraping agents to cut bandwidth costs. If I believed that an agent was collecting info that could hit my bottom line, that agent would be banned with extreme prejudice.
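
For anyone wanting to try this kind of ban, here is a minimal sketch of matching the User-Agent header against a blocklist. The substrings below are hypothetical placeholders, not real crawler names, and a real setup would more likely do this in the web server config than in application code.

```python
# Hypothetical blocklist: these substrings are placeholders, not real crawlers.
BLOCKED_AGENT_SUBSTRINGS = ("examplescraper", "snoopbot")


def is_blocked(user_agent, blocked=BLOCKED_AGENT_SUBSTRINGS):
    """Return True if the User-Agent header contains a blocked substring."""
    ua = (user_agent or "").lower()  # a missing header is treated as empty
    return any(token in ua for token in blocked)
```

A request handler could call `is_blocked(...)` on the incoming header and answer with a 403 when it returns True. An agent that fakes its User-Agent will pass this check, so it only catches agents that identify themselves.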

Quote:
Originally Posted by FightThisPatent
there are many, many other web crawlers out there so you'd be spending alot of time tracking them all down and we spider from many different IPs
If you're relying on the aggregate of other SEs' 'link:domain.com' results for the accuracy of your report, well, good luck convincing others that your report is comprehensive. Has anyone checked Alexa or Google lately?

Tracking and documenting your 'footprint(s)' is a technical no-brainer (even if you pretend to be IE through multiple proxies and tread softly on your tippiest of tippy toes). Over a short period of time, your agents WILL be registered, with all their IPs, in the respective Bad Agent registries.

You can use proxies, run from different servers or even desktop apps in basements and garages across the globe, but 'patterns are patterns'.
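
One pattern that tends to survive a faked User-Agent is fetch behavior: a browser pulls images and stylesheets along with pages, while many crawlers pull pages only. The sketch below flags IPs on that basis. The suffix list, the `min_pages` threshold, and the `(ip, path)` input format are all assumptions for illustration, and the heuristic can misfire (for example on text-only browsers or caching proxies).

```python
from collections import defaultdict

# Assumed list of static-asset extensions; adjust for the site in question.
STATIC_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def suspicious_ips(requests, min_pages=20):
    """Flag IPs that fetched at least min_pages pages and no static assets.

    requests: iterable of (ip, path) tuples already parsed from the web log.
    """
    pages = defaultdict(int)
    assets = defaultdict(int)
    for ip, path in requests:
        clean = path.split("?", 1)[0].lower()  # ignore the query string
        if clean.endswith(STATIC_SUFFIXES):
            assets[ip] += 1
        else:
            pages[ip] += 1
    return {ip for ip, n in pages.items()
            if n >= min_pages and assets.get(ip, 0) == 0}
```

Running the same check over logs from several days, and comparing which IPs get flagged together, is one way the 'footprint' of a distributed agent could start to show.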

Quote:
Originally Posted by FightThisPatent
the t3report is based on linking relationships, so websites that link to your domain would still be accessed and have links that connect to you. being able to block our spiders could mean that the external links that you have that connect to the target domain of the t3report could be blocked, and thus, your domain never shows up in the report.. but other web crawlers that do access your site could be harvested to build up the missing links that end up connecting you into the report.
Correct. But if those other websites realize, as I do, that it's best to block your 'agents' unless one actually wants the report, your chain of links is going to scatter quickly.

Quote:
Originally Posted by FightThisPatent
anyone can go to alexa or google and type in link:domain.com where domain.com is your domain and be able to pull up links to your domain, so you can't really hide
It's not about hiding - it's about understanding what value your report has once all your independent agents have been blocked by paranoid site owners (who may trust you, but not those you furnish their info to), if it is just an aggregate of Google, Alexa, AllTheWeb and whatever other free public reporting systems already exist.

Quote:
Originally Posted by FightThisPatent
by having your website in the report, it could actually gain business for you if people see you have good traffic leading to you.. why chase down each and every linker to you? easier to just tap into your traffic by buying ad space on your site, link exchange, or entice to be an affiliate manger.
Those whom I link to already know my value - it's called affiliate stats (and sales).

Those who link to me, and more importantly send traffic, are known as well, through a thing called 'web logs'. In fact, the actual traffic a site sends me (available free in my web logs) is more important than any (promising) links they may have to me.
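
As a rough illustration of pulling referring sites out of web logs, here is a minimal sketch that counts referrer hosts. It assumes the logs are in the common "combined" format, where the referrer is the second-to-last quoted field; the function name and the `own_host` parameter are made up for this example.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumed combined log format tail: "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"$')


def referrer_counts(lines, own_host):
    """Count hits per external referring host, skipping own_host and '-'."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line.strip())
        if not match:
            continue  # line is not in the assumed format
        host = urlparse(match.group("referer")).hostname
        if host and host != own_host:
            counts[host] += 1
    return counts
```

Calling `referrer_counts(open("access.log"), "mysite.example").most_common(10)` would list the top ten sending sites, with the file name and host being placeholders.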

Quote:
Originally Posted by FightThisPatent
if you are getting bad traffic and passing it on, then having that revealed could be bad for you, but bad traffic is bad traffic, there is no defense for that..
Do we get spanked too?

Quote:
Originally Posted by FightThisPatent
Trying to block out our spiders goes again to my point of chopping off your nose to spite your face
I'm not underestimating the challenges of nailing malicious bots and spiders. They can be a royal pain in the butt, and sometimes the best way to deal with them is to deal with the company doing the unwelcome attacking. I don't know about your site TOS, but mine expressly prohibits bad agents, and sometimes technology is not the only remedy for annoying pests.

Leave my nose alone - I like it!

It tells me when something smells 'off' and right now something smells fishy about this 'tool' (I did wash my hands before typing this).

But hey, what do I know?

-Dino