perl question

  • fris
    Too lazy to set a custom title
    • Aug 2002
    • 55679

    #1

    perl question

    Is there something like WWW::Mechanize for a local file?


    Code:
    #!/usr/local/bin/perl
    
    use WWW::Mechanize;
    
    binmode(STDOUT, ":utf8");
    
    my $url  = "http://domain.com/bookmarks.html";
    my $mech  = WWW::Mechanize->new();
    $mech->get( $url );
    my @links = $mech->links();
    
    foreach my $link (@links) {
       print $link->url() . "|" . $link->text() . "\n";
    }
    like this but for a local file?
    Since 1999: 69 Adult Industry awards for Best Hosting Company and professional excellence.
  • Tempest
    Too lazy to set a custom title
    • May 2004
    • 10217

    #2
    No idea... I've always used HTML::TokeParser.
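For anyone who wants to try the HTML::TokeParser route, here is a rough sketch of how it could look for a local bookmarks file. The sub name extract_links and the default file name bookmarks.html are made up for illustration, and it assumes the export is UTF-8 encoded.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;

# Return a list of [url, text] pairs, one per <a href="..."> in an HTML file.
# extract_links is a hypothetical helper name, not part of HTML::TokeParser.
sub extract_links {
    my ($file) = @_;

    # Assumption: the bookmarks export is UTF-8 encoded.
    open my $fh, '<:encoding(UTF-8)', $file
        or die "Cannot open $file: $!";
    my $parser = HTML::TokeParser->new($fh)
        or die "Cannot parse $file";

    my @links;
    while (my $tag = $parser->get_tag('a')) {
        my $href = $tag->[1]{href};      # attribute hash sits at index 1
        next unless defined $href;       # skip anchors without an href
        my $text = $parser->get_trimmed_text('/a');
        push @links, [$href, $text];
    }
    close $fh;
    return @links;
}

# Hypothetical default file name; pass another one as the first argument.
my $file = @ARGV ? $ARGV[0] : 'bookmarks.html';
if (-e $file) {
    binmode(STDOUT, ':utf8');
    print "$_->[0]|$_->[1]\n" for extract_links($file);
}
else {
    warn "No such file: $file\n";
}
```

This prints the same url|text lines as the WWW::Mechanize loop in the first post, without any HTTP request.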


    • Barry-xlovecam
      It's 42
      • Jun 2010
      • 18083

      #3
      split /regex/ works

      This looks sort of nasty, but it works.
      I used a Firefox bookmarks file.
      Some adjustments to the split regexes might be necessary ...

      You don't need a complex module to manipulate a line of text in Perl.
      Perl has a very complex and efficient regex engine in its core distribution.
      I am sure there are more elegant ways to regex this, but this works.

      Code:
      #!/usr/bin/perl
      ####################################
      #bookmarkfile.cgi
      #
      #
      #
      #
      ####################################
      use CGI::Carp qw/fatalsToBrowser/;
      use CGI qw/:standard/;
      use strict;
      use warnings;
      
      print "Content-type: text/html\n\n";
      
      # Refuse any query string containing characters outside [A-Za-z0-9_].
      my $mystuff = $ENV{'QUERY_STRING'} // '';
      if ($mystuff =~ /[^a-zA-Z0-9_]/) { print qq~HUH???~; exit; }
      
      my $bookmarkfile = "barry-bookmarks-6-2010.html";
      
      open my $fh, "<", $bookmarkfile or die "Cannot open $bookmarkfile: $!";
      
      #######SAMPLE LINE
      #        <DT><A HREF="http://trends.google.com/websites?q=xlovecam.com&geo=all&date=all&sort=0
      #" ADD_DATE="1274745854" LAST_MODIFIED="1274745854">Google Trends for Websites: xlovecam.com</A>
      #######
      
      my @bookmarks = <$fh>;
      close $fh;
      
      # Keep only the lines that hold a link (http or https).
      my @urls = grep { /HREF="https?:/ } @bookmarks;
      
      foreach my $line (@urls) {
          my @a       = split /HREF="/, $line;       # $a[1] starts at the URL
          my @b       = split /" ADD_DATE/, $a[1];   # $b[0] is the URL
          my @anchor1 = split />/, $a[1];            # $anchor1[1] starts at the link text
          my @anchor  = split /</, $anchor1[1];      # $anchor[0] is the link text
      
          print "$b[0]|$anchor[0]<br/>\n";
      }
      outputs:
      Code:
      http://trends.google.com/websites?q=xlovecam.com&geo=all&date=all&sort=0|Google Trends for Websites: xlovecam.com
      Last edited by Barry-xlovecam; 05-26-2011, 07:48 PM.


      • DangerX !!!
        Confirmed User
        • Feb 2011
        • 886

        #4
        I will ask my girl later, she does a lot of Perl. Myself, I've always preferred Python over Perl, much cleaner etc.
        This is sig area!


        • fris
          Too lazy to set a custom title
          • Aug 2002
          • 55679

          #5
          Actually, using file:bookmarks.html instead of http://domain.com/bookmarks.html works.
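A minimal sketch of that approach, for reference. It builds an absolute file: URL with URI::file rather than the relative file:bookmarks.html form, on the assumption that an absolute URL is less sensitive to the current directory. The sub name local_links and the default file name are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
use URI::file;

# Return "url|text" strings for every link in a local HTML file.
# local_links is a hypothetical helper name.
sub local_links {
    my ($path) = @_;

    # Turn the path into an absolute file: URL.
    my $url  = URI::file->new_abs($path);
    my $mech = WWW::Mechanize->new();
    $mech->get($url);    # dies on failure, since autocheck is on by default

    # text() can be undef (for example on image links), so fall back to ''.
    return map { $_->url() . '|' . ($_->text() // '') } $mech->links();
}

# Hypothetical default file name; pass another one as the first argument.
my $path = @ARGV ? $ARGV[0] : 'bookmarks.html';
if (-e $path) {
    binmode(STDOUT, ':utf8');
    print "$_\n" for local_links($path);
}
else {
    warn "No such file: $path\n";
}
```

The file should have an .html extension, since the content type of a local file is guessed from its name and Mechanize only extracts links from HTML.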


          • Barry-xlovecam
            It's 42
            • Jun 2010
            • 18083

            #6
            Open the file you read from with <
            Open the file you write to, if necessary, with > (overwrite) or >> (append)

            print WRITEFILE "data ...\n";
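Spelled out with core Perl only, the read/write steps above could look roughly like this. The sub name copy_matching and the file names bookmarks.html and links.txt are placeholders, and it uses lexical filehandles instead of barewords like WRITEFILE.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy every line of $in that matches $pattern into $out.
# $mode is '>' to overwrite the output file or '>>' to append to it.
# copy_matching is a hypothetical helper name.
sub copy_matching {
    my ($in, $out, $mode, $pattern) = @_;

    open my $read,  '<',   $in  or die "Cannot read $in: $!";
    open my $write, $mode, $out or die "Cannot write $out: $!";

    my $count = 0;
    while (my $line = <$read>) {
        next unless $line =~ $pattern;
        print {$write} $line;
        $count++;
    }

    close $read;
    close $write or die "Cannot close $out: $!";
    return $count;
}

# Placeholder file names for illustration.
my ($in, $out) = ('bookmarks.html', 'links.txt');
if (-e $in) {
    my $n = copy_matching($in, $out, '>', qr/HREF=/i);
    print "Wrote $n lines to $out\n";
}
```

Nothing here needs the CPAN shell or root access.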

            If a module will handle many cases, install it. The problem is the module installation: in a lot of cases users do not have root access or use of the CPAN shell.

            Considering your prior attempt at using sed for this, I am assuming this is for local use of some sort.

            WWW::Mechanize is an interesting module.


            • u-Bob
              there's no $$$ in porn
              • Jul 2005
              • 33063

              #7
              Could always use WWW::Mechanize's update_html method </ugly hack>


              • fris
                Too lazy to set a custom title
                • Aug 2002
                • 55679

                #8
                Originally posted by Barry-xlovecam
                Open the file you read from with <
                Open the file you write to, if necessary, with > (overwrite) or >> (append)

                print WRITEFILE "data ...\n";

                If a module will handle many cases, install it. The problem is the module installation: in a lot of cases users do not have root access or use of the CPAN shell.

                Considering your prior attempt at using sed for this, I am assuming this is for local use of some sort.

                WWW::Mechanize is an interesting module.
                Ya, just for local use: to take the bookmarks and make plain <a href> links from the console, because the Chrome bookmark export is ugly with tables.

