GoFuckYourself.com - Adult Webmaster Forum

acctman 01-28-2011 02:16 AM

Shell wget/sed script help needed
 
I need help using sed to parse HTML. This is what I'm trying to do:

1. wget http://site.com/xap/wp7?p=1
2. Read the HTML and extract every ProductName that appears in title="Free Shipping ProductName">. For example, from title="Free Shipping HD7-Case001"> the value HD7-Case001 is extracted.
3. Repeat for each page up to page 50.

Code:

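#!/bin/bash
# bash is required here: the C-style for (( ... )) loop is not POSIX sh.

for (( i = 1; i <= 50; i++ ))
do
        # Fetch page $i quietly and write the HTML to stdout for parsing.
        wget -q -O- "http://site.com/xap/wp7?p=$i" |
        sed ... Need the parsing part
done > "products.txt"   # collect the extracted names (was "<", which reads the file instead)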


fris 01-28-2011 08:06 AM

Your question has already been answered here:

http://stackoverflow.com/questions/4...g-wget-and-sed

fris 01-28-2011 08:07 AM

Code:

[chris@jumbo ~]$ cat test.html
<a href="http://www.domain.com" title="Free Shipping HD7-Case001">
link</a>
<a href="http://www.domain.com" title="Free Shipping HD2-Case001">link</a>
<a href="http://www.domain.com" title="Free Shipping HD3-Case001">link</a>
<a href="http://www.domain.com" title="Free Shipping HD7-Case009">link</a>
<a href="http://www.domain.com" title="Free Shipping HD7-Case002">link</a>

Code:

[chris@jumbo ~]$ cat test.html | tr '"' '\n' | grep "^Free Shipping " | cut -d ' ' -f 3
HD7-Case001
HD2-Case001
HD3-Case001
HD7-Case009
HD7-Case002
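
For anyone who wants the sed version the thread title asks for, here is a rough sketch that wires the extraction into the page loop from the first post. The function names extract_names and fetch_all are made up for illustration, and it assumes at most one title="Free Shipping ..." attribute per line of HTML, as in test.html above. It sticks to plain POSIX sh, so it does not need the bash-only for (( )) loop.

```shell
#!/bin/sh
# Hypothetical helper: reads HTML on stdin and prints whatever follows
# "Free Shipping " inside a title="..." attribute.
# Assumption: at most one such attribute per input line.
extract_names() {
    sed -n 's/.*title="Free Shipping \([^"]*\)".*/\1/p'
}

# Hypothetical helper: fetches pages 1..$1 and prints every product name.
# The URL is the placeholder from the original post.
fetch_all() {
    last_page=$1
    i=1
    while [ "$i" -le "$last_page" ]; do
        wget -q -O- "http://site.com/xap/wp7?p=$i" | extract_names
        i=$((i + 1))
    done
}

# Example call (not run here, since it needs the real site):
#   fetch_all 50 > products.txt
```

Unlike cut -d ' ' -f 3, the sed pattern keeps the whole name even if it contains spaces. If a page puts several links on one line, the tr '"' '\n' approach above is the safer choice, because this sed expression only reports the last match on each line.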


acctman 01-28-2011 08:51 AM

Thanks... everything worked out this morning. I had an extra character that I had mistyped.


All times are GMT -7.