| GoFuckYourself.com - Adult Webmaster Forum |
| Discuss what's fucking going on, and which programs are best and worst. One-time "program" announcements from "established" webmasters are allowed. | 
|  12-30-2005, 03:23 AM | #1 | 
| Confirmed User Join Date: Mar 2004 
					Posts: 5,116
				 | 
				
				A program that can extract data from a list of URLs?
			 Does anyone know a program that can extract data from a list of URLs? I want to be able to set where in each document it should start and stop grabbing data. Example: start from: <a href=" and stop at: "> Is there any program that can do that?  | 
|   |           | 
|  12-30-2005, 05:05 AM | #2 | 
| there's no $$$ in porn Industry Role:  Join Date: Jul 2005 Location: icq: 195./568.-230 (btw: not getting offline msgs) 
					Posts: 33,063
				 | Perl is your friend. | 
|   |           | 
|  12-30-2005, 05:44 AM | #3 | 
| Confirmed User Join Date: Feb 2004 Location: United Kingdom 
					Posts: 575
				 | Yeah, I was going to say Perl as well. Check out m{href\=\"(.*?)\"} or the module tokenparser (probably meaning HTML::TokeParser). Have fun!!! | 
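The pattern suggested above can also be tried from PHP, since its preg functions use Perl-compatible regular expressions. What follows is only a minimal sketch, assuming the links use double-quoted href attributes; the function name extract_hrefs and the sample string are made up for illustration and do not come from the thread.

```php
<?php
// Hypothetical helper: returns every double-quoted href value found in $html.
function extract_hrefs(string $html): array
{
    // Non-greedy capture between href=" and the next double quote.
    // The /i flag makes the match case-insensitive (HREF, Href, ...).
    if (preg_match_all('/href="(.*?)"/i', $html, $matches) === false) {
        return []; // regex engine error
    }
    // $matches[1] holds the contents of the first capture group.
    return $matches[1];
}

// Made-up sample input for illustration.
$sample = '<a href="testurl1.com">crap<A HREF="testurl2.com">more crap';
print_r(extract_hrefs($sample));
```

A regex like this is fine for quick scraping of predictable markup; for messy real-world HTML a proper parser is usually the safer choice.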
|   |           | 
|  12-30-2005, 05:55 AM | #4 | 
| Confirmed User Join Date: Sep 2003 
					Posts: 8,713
				 | Regex is your friend. | 
|   |           | 
|  12-30-2005, 06:02 AM | #5 | |
| Confirmed User Join Date: Jan 2005 
					Posts: 422
				 | Quote: 
 
				 | |
|   |           | 
|  12-30-2005, 06:08 AM | #6 | 
| Confirmed User Join Date: Mar 2004 
					Posts: 5,116
				 | I could program something in Visual Basic... but I know there have to be some applications out there already for this. (And I'm also lazy.) | 
|   |           | 
|  12-30-2005, 06:41 AM | #7 | 
| Confirmed User Join Date: Feb 2004 Location: If i was up your ass you'd know 
					Posts: 3,695
				 | Perl was designed to do exactly that. PHP is pretty good for doing it as well. | 
|   |           | 
|  12-30-2005, 09:48 AM | #8 | 
| see you later, I'm gone Industry Role:  Join Date: Oct 2002 
					Posts: 14,122
				 | Here is a way to do it without regex:

<?php
// buffer is a variable to hold the data we are working on
$buffer='';

// set vars for the beginning of what we want to parse and the end of what we want to parse
$begin_pattern='<a href="';
$end_pattern='">';

// set up var for data being extracted. This could be an array or a string to write to a file, whatever
// here I am just using it to echo the data extracted
$dataout='';

// set file2read to point at the path and file that the list is stored in
$file2read='testfile.txt';

// open the file
$filein=fopen('testfile.txt','r');

// suck the entire file into a variable
while (!feof($filein)){
    $buffer=$buffer . fgets($filein);
}

// close the file
fclose($filein);

// check to make sure we got something out of the file
if ($buffer>''){
    // do this while any occurrences of the beginning pattern are still in the data
    while( substr_count(strtolower($buffer),$begin_pattern)>0 ){
        // trim the data to just past the next beginning pattern occurrence
        $buffer=substr($buffer, strpos(strtolower($buffer),$begin_pattern)+strlen($begin_pattern));
        // pull the data in from where we trimmed the data to the occurrence of the next end pattern
        $dataout=substr($buffer,0,strpos($buffer,$end_pattern));
        // trim the buffer by the length of the data we pulled
        $buffer=substr($buffer,strlen($dataout));
        // output the data we pulled - could go into an array here or write it to a file, whatever
        echo $dataout . '<br>';
    }
}
?>

It takes a file that looks like this:

<a href="testurl1.com">crapcrapcrap<a href="testurl2.com">morecrapmorecrap<a href="testurl3.com">yesevenmore<a href="testurl4.com">awholelottacrap<a href="testurl5.com"><a href="testurl6.com"><a href="testurl7.com"><a href="testurl8.com"><a href="testurl9.com"><a href="testurl10.com"><a href="testurl11.com"><a href="testurl12.com"><a href="testurl13.com"><a href="testurl14.com"><a href="testurl15.com"><a href="testurl16.com"><a href="testurl17.com"><a href="testurl18.com"><a href="testurl19.com"><a href="testurl20.com"><a href="testurl21.com"><a href="testurl22.com"><a href="testurl23.com"><a href="testurl24.com"><a href="testurl25.com">

and outputs it like this:

testurl1.com
testurl2.com
testurl3.com
testurl4.com
testurl5.com
testurl6.com
testurl7.com
testurl8.com
testurl9.com
testurl10.com
testurl11.com
testurl12.com
testurl13.com
testurl14.com
testurl15.com
testurl16.com
testurl17.com
testurl18.com
testurl19.com
testurl20.com
testurl21.com
testurl22.com
testurl23.com
testurl24.com
testurl25.com
				__________________ All cookies cleared! | 
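The same start/stop idea can also be wrapped in a reusable function that walks the string with an offset instead of repeatedly trimming the buffer. This is only a sketch, not sarettah's code: extract_between is a hypothetical name, the sample string is made up, and the start marker is matched case-insensitively with stripos() to mirror the strtolower() call in the post above.

```php
<?php
// Hypothetical helper: collect every substring found between $begin and $end.
function extract_between(string $text, string $begin, string $end): array
{
    $found = [];
    $offset = 0;
    // Guard against empty markers, which would never advance the offset.
    if ($begin === '' || $end === '') {
        return $found;
    }
    // stripos() finds the next start marker, ignoring case.
    while (($start = stripos($text, $begin, $offset)) !== false) {
        $start += strlen($begin);            // first character after the start marker
        $stop = strpos($text, $end, $start); // next end marker after that point
        if ($stop === false) {
            break; // start marker without a matching end marker
        }
        $found[] = substr($text, $start, $stop - $start);
        $offset = $stop + strlen($end);      // continue after the end marker
    }
    return $found;
}

// Made-up sample input for illustration.
$sample = '<a href="testurl1.com">crapcrapcrap<a href="testurl2.com">morecrap';
foreach (extract_between($sample, '<a href="', '">') as $url) {
    echo $url, "\n";
}
```

To run this over a list of URLs, each line of the list could be read with file_get_contents() and passed to the function; fetching remote URLs that way requires allow_url_fopen to be enabled in the PHP configuration.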
|   |           | 
|  12-30-2005, 10:58 AM | #9 | 
| <&(©¿©)&> Industry Role:  Join Date: Jul 2002 Location: Chicago 
					Posts: 47,882
				 | if you want a custom solution, icq: 33375924 | 
|   |           | 
|  12-30-2005, 11:27 AM | #10 | 
| Confirmed User Join Date: Mar 2004 
					Posts: 5,116
				 | Wow, thank you VERY much sarettah! =)) | 
|   |           | 
|  12-30-2005, 11:41 AM | #11 | |
| see you later, I'm gone Industry Role:  Join Date: Oct 2002 
					Posts: 14,122
				 | Quote: 
 You're welcome. However, looking back, I have a code error in there from when I was making it "friendly". Where I have:

// open the file
$filein=fopen('testfile.txt','r');

Change it to:

// open the file
$filein=fopen($file2read,'r');
				__________________ All cookies cleared! | |
|   |           | 
|  12-30-2005, 11:43 AM | #12 | 
| Registered User Join Date: Aug 2005 
					Posts: 3,570
				 | hey sweet   i'm going to play with this too | 
|   |           |