03-12-2014, 03:21 PM
It's 42
Industry Role:
Join Date: Jun 2010
Location: Global
Posts: 18,083
|
Quote:
Originally Posted by adultmobile
I mean your list misses thousands of domain names; are we sure the missing ones had the correct yes/no? The same goes for results missing from my script: we count only those listed... there is not an hand-made complete correct list to compare from. Anyway, I can give my script (it is Python 3) to everyone who is an active affiliate of tubecamgirl; look at my signature.
|
"My list" is identical to your list. I just checked those same domains with AWIS.
It took 3 min of query time (with a 0.25 s sleep between requests) and $0.07 in cash, plus some coding time to tweak the script so it just writes those values to a file.
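For anyone who wants to repeat this, here is a rough sketch of that kind of throttled loop. It is not the script I actually ran: `check_domains` and the `lookup` callable are hypothetical names, and `lookup` stands in for whatever function performs the real AWIS request, which is deliberately left out here.

```python
import csv
import time


def check_domains(domains, lookup, out_path, delay=0.25, sleep=time.sleep):
    """Query each domain once and write the results to a CSV file.

    `lookup` is a hypothetical callable (domain -> value); plug in your
    own function that performs the actual request.
    `delay` is the pause in seconds between two consecutive queries.
    """
    count = 0
    with open(out_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["domain", "value"])
        for domain in domains:
            writer.writerow([domain, lookup(domain)])
            count += 1
            sleep(delay)  # throttle so the service is not hammered
    return count
```

Call it as `check_domains(my_domains, my_lookup, "results.csv")`. The `sleep` parameter is only there so the pause can be swapped out when testing.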
"there is not an hand-made complete correct list to compare from" -- of course not -- it is proprietary data.
However:
http://aws.amazon.com/datasets/41740
Here is your new improved dataset ;) FREE (some sorting required)
http://commoncrawl.org/new-crawl-data-available/
Hadoop -- You can be a better Google 
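To give an idea of the "sorting required" part, here is a toy map/reduce-style sketch in plain Python that counts how many URLs belong to each hostname. It assumes the input is simply an iterable of URL strings; it does not read the actual Common Crawl file format and does not use Hadoop itself, and the function names are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def map_url(url):
    """Map step: emit (hostname, 1) for a URL that has a hostname."""
    try:
        host = urlsplit(url.strip()).hostname
    except ValueError:
        host = None  # malformed URL, skip it
    if host:
        yield host.lower(), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each hostname."""
    totals = Counter()
    for host, n in pairs:
        totals[host] += n
    return dict(totals)


def count_hosts(urls):
    """Run the map step over every URL, then reduce the emitted pairs."""
    return reduce_counts(pair for url in urls for pair in map_url(url))
```

On a real cluster the map and reduce steps would run as separate distributed jobs; here they are chained in one process just to show the shape of the computation.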