05-29-2017, 11:56 PM
deonbell
Confirmed User
 
Join Date: Sep 2015
Posts: 1,045
Quote:
Originally Posted by Barry-xlovecam View Post
step 1 curl the page and > save
step 2 oneliner parse and save the data

Code:
 sed 's/>/>\n/g' bitcoin2.html|egrep '/bitcoin/address/'|cut -d'/' -f6|cut -d'"' -f1  |less
Why make things so complex?

>> wallets.csv
then;
mysql>
LOAD DATA LOCAL INFILE
I did not know about the sed command, or I would have used it. But since I already had the code written, I put it in a loop.
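
For anyone else who has not used sed, here is a rough Python sketch of what the quoted one-liner appears to do: break the HTML after every ">", keep the pieces that contain "/bitcoin/address/", take the sixth "/"-separated field, and cut it at the first double quote. The function name extract_addresses is made up for illustration. It assumes the links are absolute URLs like the one in the search string, and it skips pieces with fewer than six fields where cut would print an empty line.

```python
def extract_addresses(html):
    """Rough Python equivalent of the sed | egrep | cut | cut pipeline."""
    addresses = []
    # sed 's/>/>\n/g' : start a new line after every '>'
    for piece in html.replace(">", ">\n").split("\n"):
        # egrep '/bitcoin/address/' : keep only pieces containing the link path
        if "/bitcoin/address/" not in piece:
            continue
        # cut -d'/' -f6 : sixth slash-separated field (index 5)
        fields = piece.split("/")
        if len(fields) < 6:
            continue  # assumption: skip short pieces instead of emitting ""
        # cut -d'"' -f1 : everything before the first double quote
        addresses.append(fields[5].split('"')[0])
    return addresses
```

Reading a saved page into a string and passing it to this function should give the same list the pipeline pages through with less.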

Process
1. Use curl to download the first 80 pages, naming them 1.html, 2.html, 3.html and so on.
2. Run my program that will parse all 80 files.
Code:
# Parse 1.html .. 80.html and write the addresses found in each to 1.txt .. 80.txt.
searchstring = "https://bitinfocharts.com/bitcoin/address/"

for filecount in range(1, 81):
    # Open input and output once per page; "with" closes both files.
    with open(str(filecount) + ".html", "r") as searchfile, \
         open(str(filecount) + ".txt", "w") as trimfile:
        for line in searchfile:
            if searchstring not in line:
                continue
            # Every chunk after the first starts with an address,
            # which ends at the next double quote. Take at most 100.
            for chunk in line.split(searchstring)[1:101]:
                left_text = chunk.partition('"')[0]
                print(left_text)
                trimfile.write(left_text.strip() + '\n')
3. Run LOAD DATA LOCAL INFILE on 1.txt,
then press the up arrow, change it to 2.txt, and so on.
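
Step 3 could probably be reduced to a single load by first joining the 80 text files into one, along the lines of the ">> wallets.csv" idea in the quote. A minimal sketch, assuming the files sit in one directory and are named 1.txt through 80.txt as above; the function name combine_address_files and its parameters are made up for illustration.

```python
import os

def combine_address_files(directory, count=80, out_name="wallets.csv"):
    """Join 1.txt .. count.txt into one file; return the number of lines written."""
    written = 0
    with open(os.path.join(directory, out_name), "w") as out:
        for n in range(1, count + 1):
            path = os.path.join(directory, str(n) + ".txt")
            if not os.path.exists(path):
                continue  # assumption: silently skip pages that were never parsed
            with open(path) as f:
                for line in f:
                    line = line.strip()
                    if line:  # drop blank lines so they do not become empty rows
                        out.write(line + "\n")
                        written += 1
    return written
```

After calling combine_address_files(".") in the directory holding the .txt files, one LOAD DATA LOCAL INFILE on wallets.csv should replace the 80 separate loads.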

Thank you for your help my friend.
You are the Kirk to my Khan.