I'd like to extract all the HTML contained within a specific table. I have targeted the table and tried to pull out the required HTML with get_trimmed_text, but that method returns only the text content, so none of the HTML tags are preserved. Is there an equivalent to get_trimmed_text within HTML::TokeParser that keeps the markup, or should I be looking at a different module? Is there a function for trimming down HTML in WWW::Mechanize?
Code:
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;
use LWP::Simple;
# extract.pl
print "Enter the page URL: ";
chomp( my $domain = <STDIN> );
print "Enter the output HTML filename: ";
chomp( my $html_output = <STDIN> );
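# LWP::Simple's get() returns undef on failure and does not set $!
my $content = get($domain);
defined $content or die "Could not fetch $domain\n";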
my $stream = HTML::TokeParser->new( \$content ) or die $!;
while ( my $tag = $stream->get_tag( "table" ) ) {
    if ( $tag->[1]{cellpadding} and $tag->[1]{cellpadding} eq '8' ) {
        # what do i do here?
    }
}
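
One possible way to fill in that block, as a rough sketch rather than a definitive answer: HTML::TokeParser's get_token returns each token together with its original source text, so the tokens can be concatenated until the matching </table> is reached. The helper names token_source and extract_table_html and the $sample string are made up for illustration, and the sketch assumes only the first matching table is wanted.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;

# Return the original source text of a token from get_token().
# Start tags, end tags and processing instructions keep it in the last
# slot; text, comments and declarations keep it in slot 1.
sub token_source {
    my ($token) = @_;
    return $token->[0] =~ /^(?:S|E|PI)$/ ? $token->[-1] : $token->[1];
}

# Hypothetical helper: collect the raw HTML of the first table whose
# cellpadding attribute equals $padding, nested tables included.
sub extract_table_html {
    my ( $html, $padding ) = @_;
    my $stream = HTML::TokeParser->new( \$html ) or return;

    while ( my $tag = $stream->get_tag('table') ) {
        my $value = $tag->[1]{cellpadding};
        next unless defined $value and $value eq $padding;

        my $out   = $tag->[3];    # raw text of the opening <table ...> tag
        my $depth = 1;            # counts open tables so nesting is handled
        while ( my $token = $stream->get_token ) {
            $out .= token_source($token);
            if    ( $token->[0] eq 'S' and $token->[1] eq 'table' ) { $depth++ }
            elsif ( $token->[0] eq 'E' and $token->[1] eq 'table' ) { $depth-- }
            last if $depth == 0;
        }
        return $out;
    }
    return;    # no matching table found
}

# Illustrative input only; in the script above this would be $content.
my $sample = '<p>intro</p><table cellpadding="8"><tr><td><b>bold</b> cell</td></tr></table><p>after</p>';
print extract_table_html( $sample, '8' ), "\n";
```

In the original script the returned string could then be written to the file named in $html_output with an ordinary open and print. Text tokens from get_token are not entity-decoded, so the output should match the page source rather than the decoded text that get_trimmed_text produces.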