To match image links (an <img> wrapped in an <a>) in pretty much any HTML, here's what I use:
PHP Code:
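preg_match_all("/<a\b[^>]*?\bhref=[\"']?([^\"'\s>]+)[\"']?[^>]*>\s*<img\b[^>]*?\bsrc=[\"']?([^\"'\s>]+)[\"']?[^>]*>\s*<\/a>/i", $tmpStr, $matches);
I'm sure it could be optimized a bit, but .* is greedy and will happily run past the end of a tag into the next link, so I use [^>]* to keep each piece inside its own tag. Note that | and ^ are literal characters inside [...], so the character classes only need the quotes, \s and >. You'll find it a lot easier to read the whole HTML page into one string and run that regexp on the whole thing instead of going line by line. If you flatten the linebreaks first, replace them with spaces rather than deleting them, so attributes on separate lines don't run together. That way, any links or image tags that span more than one line will match.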
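Here's a rough sketch of the whole thing end to end, in case it helps. The sample HTML in $html is made up for illustration, and $pattern is a single-quoted version of the pattern with each wildcard limited to [^>]*; the PREG_SET_ORDER flag is my own addition so each match comes back as one [full match, href, src] row. Treat it as a starting point, not a real HTML parser.

```php
<?php
// Hypothetical sample input; in practice this would be the fetched page.
$html = "<p>Gallery</p>\n"
      . "<a class=\"thumb\"\n   href=\"/full/1.jpg\"><img\n   src='/thumb/1.jpg' alt=\"one\"></a>\n"
      . "<a href=/full/2.jpg><img src=/thumb/2.jpg></a>";

// Replace linebreaks with a space (not nothing) so attributes stay separated.
$tmpStr = preg_replace('/\R+/', ' ', $html);

// Each wildcard is [^>]*, so it cannot run past the end of its own tag.
$pattern = '/<a\b[^>]*?\bhref=["\']?([^"\'\s>]+)["\']?[^>]*>\s*'
         . '<img\b[^>]*?\bsrc=["\']?([^"\'\s>]+)["\']?[^>]*>\s*<\/a>/i';

// PREG_SET_ORDER groups the captures per match: [full match, href, src].
$count = preg_match_all($pattern, $tmpStr, $matches, PREG_SET_ORDER);

$pairs = [];
foreach ($matches as $m) {
    $pairs[] = ['href' => $m[1], 'src' => $m[2]];
}

foreach ($pairs as $p) {
    echo $p['href'], ' -> ', $p['src'], PHP_EOL;
}
```

With that sample, both links should come back as separate matches, including the one whose tags are split across lines and the one with unquoted attributes.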
