GoFuckYourself.com - Adult Webmaster Forum


BigBen 03-11-2006 03:16 PM

Question for php and regex gurus...
 
I need to parse html for any image tags and store all of them in an array...

So if this is the html:

blah blah <img src="asdf.jpg" width="150"> more blah <img border="1" src="kasdf.gif">

The array should be:

$images[0] = "asdf.jpg"
$images[1] = "kasdf.gif"

Anybody know how I can do this?

mortenb 03-11-2006 03:25 PM

I'm thinking something similar to this:

Code:

preg_match_all("|src\=\"?'?`?([[:alnum:]:?=&@/._+-]+)\"?'?`?|i", $string, $matches);
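
For anyone following along, here is a minimal sketch of how that call fills $matches, run against the sample HTML from the first post. The pattern is the one posted above, unchanged; $html and $images are just illustrative variable names.

```php
<?php
// Sample HTML from the original question.
$html = 'blah blah <img src="asdf.jpg" width="150"> more blah <img border="1" src="kasdf.gif">';

// $matches[0] holds the full matches (the whole src="..." text),
// $matches[1] holds what the first pair of parentheses captured.
preg_match_all("|src\=\"?'?`?([[:alnum:]:?=&@/._+-]+)\"?'?`?|i", $html, $matches);

// The captured values are the list the question asks for.
$images = $matches[1];
print_r($images);
```

Note that the pattern keys on src= alone, so it does not look at which tag the attribute belongs to.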

BigBen 03-11-2006 04:04 PM

That works great. Thank you!

Why 03-11-2006 04:06 PM

There are things other than images that use src=, so you might want to be more explicit with that regex.

BigBen 03-11-2006 04:13 PM

Good point. It matches anything after src even if it's not in an img tag. Any ideas?

Why 03-11-2006 04:38 PM

Code:

preg_match_all("#src\=\"?'?`?([[:alnum:]:?=&@/._+-]+\.(?:gif|jpg))\"?'?`?#i", $string, $matches);

Might work, but I didn't test it. It would only match GIFs and JPGs, though. Note the # delimiters: the alternation uses |, so | can't also be the pattern delimiter.
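
Another way to be more explicit is to require that the src= sits inside an <img ...> tag, instead of filtering by file extension. The following is a rough sketch under that assumption, not a tested drop-in: extract_img_srcs() is a hypothetical helper name, and the pattern is a simplification that will not cope with every kind of malformed HTML.

```php
<?php
// Hypothetical helper (not from the thread): collect src values from <img> tags only.
function extract_img_srcs(string $html): array
{
    // "<img", then any attributes without leaving the tag ([^>]*?),
    // then a whitespace-preceded src=, an optional quote, and the value.
    $pattern = '#<img\b[^>]*?\ssrc\s*=\s*["\']?([^"\'\s>]+)#i';
    preg_match_all($pattern, $html, $matches);

    // Group 1 is the attribute value without its quotes.
    return $matches[1];
}

// The <script> tag also has a src attribute, but it should be skipped.
$html = '<script src="menu.js"></script> blah <img src="asdf.jpg" width="150"> '
      . '<img border="1" src="kasdf.gif">';

$images = extract_img_srcs($html);
print_r($images);
```

Because [^>] cannot cross a closing angle bracket, the match cannot start in one tag and pick up a src from a later one.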

BigBen 03-11-2006 08:33 PM

Thanks for the help. What can I use to match just the image name from the string extracted from the first match?

I.e., how do I match just pic.jpg out of: src="http://example.com/images/pic.jpg"

I tried: preg_match("/\/.+?\.jpg/", $matches[1], $imagematch);
but that matches the longest string possible (//example.com/images/pic.jpg)

Thanks.

psili 03-11-2006 08:36 PM

Maybe do a greedy match first, just to grab all the full and relative paths to the images out of the src attributes. Then implode that $matches array into a string, and run a more refined match against that string to pull out just the image names.

BigBen 03-11-2006 08:45 PM

I'm having trouble doing that second part. My first regex will store the full path and I need to do a match against that string for the single image.
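
A possible way to do that second step, sketched under a few assumptions. A lazy .+? only shortens the end of a match; the regex engine still starts at the leftmost position that can match, which is why the attempt above begins at the first slash. Also, $matches[1] from preg_match_all() is an array, so it has to be processed element by element rather than passed to preg_match() directly. The variable names below ($url, $paths, $names) are illustrative only.

```php
<?php
$url = 'http://example.com/images/pic.jpg';

// Option 1: no regex. parse_url() isolates the path (dropping any query
// string), and basename() returns the last segment of that path.
$name = basename(parse_url($url, PHP_URL_PATH));

// Option 2: regex. [^/]+ cannot contain a slash, and $ anchors the match
// to the end of the string, so only the final segment can match.
preg_match('#[^/]+$#', $url, $m);
$nameFromRegex = $m[0];

// Applied to a whole list, e.g. the $matches[1] array from preg_match_all().
$paths = ['http://example.com/images/pic.jpg', 'thumbs/kasdf.gif', 'asdf.jpg'];
$names = array_map('basename', $paths);

print_r([$name, $nameFromRegex, $names]);
```

Option 1 is probably the easier one to maintain, since it avoids a second regex altogether.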

