Welcome to the GoFuckYourself.com - Adult Webmaster Forum forums.

Discuss what's fucking going on, and which programs are best and worst. One-time "program" announcements from "established" webmasters are allowed.

 
Old 04-08-2017, 10:59 AM   #1
johnnyloadproductions
Account Shutdown
 
Industry Role:
Join Date: Oct 2008
Location: Gone
Posts: 3,611
GFY parsed profile data dump (nothing sensitive, just for the curious), and how I did it.

There are no sigs or anything else that can be spammed in the database or CSV files at the download link. This is simply for those who are curious; the image shows what to expect from the SQL and CSV.

https://www.dropbox.com/sh/3shozxdxs...qRDg37Sda?dl=0



There are over 250k rows in uses_view.csv, so it might take a while to load.
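
If the file is slow to open in a spreadsheet, streaming it row by row is one way around that. This is only a rough sketch: it assumes the CSV has a header row and is UTF-8 encoded, neither of which is stated in the post, and `iter_rows` and `count_rows` are made-up helper names.

```python
import csv


def iter_rows(path):
    """Yield one dict per CSV row without loading the whole file.

    Assumes the first line is a header row (an assumption, not confirmed
    by the post). Undecodable bytes are replaced rather than raising.
    """
    with open(path, newline="", encoding="utf-8", errors="replace") as handle:
        for row in csv.DictReader(handle):
            yield row


def count_rows(path):
    """Count data rows (header excluded) by streaming through the file."""
    return sum(1 for _ in iter_rows(path))
```

Something like `count_rows("uses_view.csv")` would then walk the file once while keeping only one row in memory at a time.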

Profiles aren't like general threads: I had to have a valid session, logged in with my account, to visit a person's profile URL. That means that, for the time being, just about every account from before 2014 lists "johnnyloadproductions" as a recent visitor.

I used python and the selenium webdriver along with pyvirtualdisplay to run iceweasel on a Raspberry Pi and fetch the profile data. This ran for about a month in the background. A cronjob would fire up the script, which would then direct iceweasel (basically firefox) to go to "profilexxxx.html".
GFY uses friendly urls, but fortunately I can still bot through the profile numbers in order and go through them all.
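
The loop described above could look roughly like the sketch below. It is not the actual script: `BASE_URL`, `profile_urls`, `crawl`, and `selenium_fetcher` are hypothetical names, the host is a placeholder, the login step needed for a valid session is left out, and the delay between requests is my own addition. The selenium and pyvirtualdisplay imports are kept inside `selenium_fetcher` so the URL logic can be tried without them installed.

```python
import time
from typing import Callable, Dict, Iterator, Tuple

BASE_URL = "https://forum.example/"  # placeholder host, not the real site


def profile_urls(start: int, stop: int, base: str = BASE_URL) -> Iterator[Tuple[int, str]]:
    """Yield (profile_id, url) for ids start..stop-1 using the profilexxxx.html pattern."""
    for profile_id in range(start, stop):
        yield profile_id, f"{base}profile{profile_id}.html"


def crawl(start: int, stop: int, fetch: Callable[[str], str], delay: float = 2.0) -> Dict[int, str]:
    """Fetch each profile page in numeric order and return {profile_id: html}."""
    pages = {}
    for profile_id, url in profile_urls(start, stop):
        pages[profile_id] = fetch(url)
        time.sleep(delay)  # pause between requests (assumed, not from the post)
    return pages


def selenium_fetcher():
    """Return (fetch, close) backed by Firefox running on a virtual display.

    Requires selenium, pyvirtualdisplay, and Xvfb. Logging in to get a
    valid session is not shown here.
    """
    from pyvirtualdisplay import Display
    from selenium import webdriver

    display = Display(visible=0, size=(1024, 768))
    display.start()
    driver = webdriver.Firefox()

    def fetch(url: str) -> str:
        driver.get(url)
        return driver.page_source

    def close() -> None:
        driver.quit()
        display.stop()

    return fetch, close
```

A cron entry would then only need to call something like `crawl(last_seen_id, last_seen_id + batch_size, fetch)` and save the returned HTML for later parsing, where `last_seen_id` and `batch_size` are whatever bookkeeping the script keeps between runs.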

Writing the robust parser took several hours but it ended up working pretty well. It's just python with beautifulsoup (an HTML parsing library), and pymysql to talk to a mysql db.
Parsing everything took 8 hours.
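
For anyone who wants the general shape of a parse-and-store step, here is a minimal sketch. To keep it runnable without extra installs it swaps in the standard library's `html.parser` and `sqlite3` in place of beautifulsoup and pymysql, so it is not the parser.py or database.py from the download. The `<dt>`/`<dd>` markup, the `users` table, and the function names are all hypothetical; the real profile pages are laid out differently.

```python
import sqlite3
from html.parser import HTMLParser


class ProfileParser(HTMLParser):
    """Collect <dt>label</dt><dd>value</dd> pairs (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._tag = None    # "dt" or "dd" while inside one of those tags
        self._label = None  # most recent label waiting for its value

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._tag = tag

    def handle_endtag(self, tag):
        if tag in ("dt", "dd"):
            self._tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._tag == "dt":
            self._label = text.rstrip(":")
        elif self._tag == "dd" and self._label:
            self.fields[self._label] = text
            self._label = None


def parse_profile(html):
    """Return a dict of profile fields found in the page."""
    parser = ProfileParser()
    parser.feed(html)
    parser.close()
    return parser.fields


def store(conn, profile_id, fields):
    """Insert one profile; missing fields are stored as NULL."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users "
        "(id INTEGER PRIMARY KEY, join_date TEXT, location TEXT, posts INTEGER)"
    )
    posts = fields.get("Posts")
    conn.execute(
        "INSERT OR REPLACE INTO users VALUES (?, ?, ?, ?)",
        (
            profile_id,
            fields.get("Join Date"),
            fields.get("Location"),
            int(posts.replace(",", "")) if posts else None,  # "3,611" -> 3611
        ),
    )
    conn.commit()
```

Using `fields.get(...)` everywhere is what keeps a run from dying on profiles with empty or hidden fields, which is most of what "robust" needs to mean for a job like this.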

I'm pretty sure someone has done this in the past for a webmaster spam campaign.

Bots are nice and can simplify tasks for you, so I thought I would share this information with all of you.
In general it's nice if a service has an API, but if it doesn't, you can do something similar to this to upload videos or images, make posts at scheduled times, bypass captchas, etc.
Old 04-08-2017, 02:45 PM   #2
fris
Too lazy to set a custom title
 
 
Industry Role:
Join Date: Aug 2002
Posts: 54,493
doubt they like people scraping the site and causing heavy loads ;)
__________________
Since 1999: 69 Adult Industry awards for Best Hosting Company and professional excellence.


my contact: fris at fris.net
Old 04-08-2017, 04:28 PM   #3
woj
<&(©¿©)&>
 
 
Industry Role:
Join Date: Jul 2002
Location: Chicago
Posts: 47,883
what you are doing is a bit odd, all this drama so you can post your results on gfy and get 5 "interesting stats" replies? you are probably bored or trying to fine tune your skills, but why not crawl and parse something that has some value? like for example some tube site to discover most common keywords/niches/paysites/models/etc? that data would be 1000x more valuable, and you could actually score a few bucks by selling it or just using the data yourself...
__________________
Custom Software Development, email: woj#at#wojfun#.#com to discuss details or skype: wojl2000 or gchat: wojfun or telegram: wojl2000
Affiliate program tools: Hosted Galleries Manager Banner Manager Video Manager
Wordpress Affiliate Plugin Pic/Movie of the Day Fansign Generator Zip Manager
Old 04-08-2017, 05:48 PM   #4
johnnyloadproductions
Account Shutdown
 
Industry Role:
Join Date: Oct 2008
Location: Gone
Posts: 3,611
Quote:
Originally Posted by woj View Post
what you are doing is a bit odd, all this drama so you can post your results on gfy and get 5 "interesting stats" replies? you are probably bored or trying to fine tune your skills, but why not crawl and parse something that has some value? like for example some tube site to discover most common keywords/niches/paysites/models/etc? that data would be 1000x more valuable, and you could actually score a few bucks by selling it or just using the data yourself...
Fine tune skills, deliberate practice, and an itch I wanted to scratch.
Those are all good suggestions, and something I can spin off or work on with several people.
I'd be willing to work with people in the future in some kind of partnership for data and stat gathering.

I like your posts woj, even if I troll you some.
Old 04-09-2017, 06:57 AM   #5
Kittens
👏 REVOLUTIONARY 👏
 
 
Industry Role:
Join Date: Jan 2016
Posts: 1,440
dump that shit into firebase, nobody wants to download a 6TB csv.
Old 04-09-2017, 07:00 AM   #6
johnnyloadproductions
Account Shutdown
 
Industry Role:
Join Date: Oct 2008
Location: Gone
Posts: 3,611
Quote:
Originally Posted by Kittens View Post
dump that shit into firebase, nobody wants to download a 6TB csv.
Nothing above 14MB in size. I added the parser.py and database.py. Have fun.
Old 04-09-2017, 07:21 AM   #7
adultchatpay
Let's Make Money
 
 
Industry Role:
Join Date: Dec 2008
Posts: 8,784
interesting stats

©2000-, AI Media Network Inc


