GoFuckYourself.com - Adult Webmaster Forum

GoFuckYourself.com - Adult Webmaster Forum (https://gfy.com/index.php)
-   Fucking Around & Business Discussion (https://gfy.com/forumdisplay.php?f=26)
-   -   Damn Duplicates -- p00f begone! (https://gfy.com/showthread.php?t=1057221)

Barry-xlovecam 02-11-2012 12:38 PM

Damn Duplicates -- p00f begone!
 
I wrote a fast file cleaner -- the grep usage is interesting ...
Clean your lists up, etc. ...

Code:

#!/usr/bin/perl
####################################
# nodupes.cgi
# you may use free as-is with no warranty
# make the outfile's chmod 666 if using the webserver
# chmod this script to 755
####################################
use strict;
use warnings;
use CGI::Carp qw/fatalsToBrowser/;
use CGI qw/:standard/;

print "Content-type: text/html\n\n";

# reject any query string containing characters outside [a-zA-Z0-9_]
# (match with //, not s///, so the check doesn't mangle the string)
my $query = $ENV{'QUERY_STRING'} || '';
if ($query =~ /[^a-zA-Z0-9_]/) { print qq~HUH???~; exit; }

my $infile  = "somesiteurl.txt";
my $outfile = "somesiteurlduped.txt";

open(INPUT, "<", $infile)    || die "infile not found: $!\n";
my @array = <INPUT>;
open(OUTPUT, ">>", $outfile) || die "outfile not writable: $!\n";

# keep only the first occurrence of each line
my %seen = ();
my @unique = grep { !$seen{$_}++ } @array;

foreach my $unique (@unique) {
    chomp $unique;
    print OUTPUT "$unique\n";
}

close OUTPUT;
close INPUT;


V_RocKs 02-11-2012 02:33 PM

Nice.... reminds me of 2000...

mikke 02-11-2012 02:52 PM

can you port it to brainfuck?

Barry-xlovecam 02-11-2012 04:28 PM

Quote:

Originally Posted by mikke (Post 18753155)
can you port it to brainfuck?

Something understandable ...

fris 02-11-2012 06:52 PM

Quote:

Originally Posted by Barry-xlovecam (Post 18753323)
Something understandable ...

or you could just use cat file.txt | sort -u

;)

much quicker

Barry-xlovecam 02-12-2012 07:33 AM

Code:

cat infile.txt|sort -u > outfile.txt
No spaces needed around the pipe, and redirect into the outfile

That is a lot easier fris, ty
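Quick throwaway demo -- one thing to note is that sort -u also reorders the lines, which may or may not matter for a URL list:

```shell
printf 'c\na\nc\nb\n' | sort -u
# output comes back sorted: a, b, c
```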

alextokyo 02-12-2012 07:47 AM

Someone said something about a poof?


http://i40.tinypic.com/aua3kn.jpg

Barry-xlovecam 02-12-2012 08:08 AM

p00f not poof -- comprehension problems?

fris 02-12-2012 08:29 AM

Quote:

Originally Posted by Barry-xlovecam (Post 18754251)
Code:

cat infile.txt|sort -u > outfile.txt
No spaces and the outfile

That is a lot easier fris, ty

the awk way to do it -- remove dupes without sorting

Code:

awk '!x[$0]++' file.txt
perl without sorting

Code:

perl -ne 'print if !$a{$_}++'
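quick throwaway test showing these keep the original order (first occurrence wins), unlike sort -u:

```shell
printf 'c\na\nc\nb\n' | awk '!x[$0]++'
# prints c, a, b -- first occurrences, original order
```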
this would remove dupe entries keyed on a single column (the first field)

awk

Code:

awk '{ if ($1 in stored_lines) x=1; else print; stored_lines[$1]=1 }' infile.txt > outfile.txt
perl

Code:

perl -ane 'print unless $x{$F[0]}++' infile > outfile
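e.g. with a made-up two-column file, only the first line per key survives:

```shell
printf 'id1 foo\nid2 bar\nid1 baz\n' \
  | awk '{ if ($1 in stored_lines) x=1; else print; stored_lines[$1]=1 }'
# prints "id1 foo" and "id2 bar"; "id1 baz" is dropped (same first field as id1 foo)
```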
sunday gfy bonus

count and show duplicate file names

Code:

find . -type f  |sed "s#.*/##g" |sort |uniq -c -d
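throwaway example (directory and file names made up):

```shell
# two files named readme.txt in different dirs, one unique name
mkdir -p demo/a demo/b
touch demo/a/readme.txt demo/b/readme.txt demo/a/only.txt
find demo -type f | sed "s#.*/##g" | sort | uniq -c -d
# shows: 2 readme.txt
```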
extra bonus

find duplicate files based on filesize, then md5 hash

Code:

find -not -empty -type f -printf "%s\n" | sort -rn | uniq -d | xargs -I{} -n1 find -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate
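throwaway demo of that pipeline (needs GNU find + md5sum; names made up) -- files with identical content get grouped, unique content is skipped:

```shell
# two identical files plus one different one
mkdir -p dupedemo
echo same > dupedemo/f1.txt
echo same > dupedemo/f2.txt
echo different > dupedemo/f3.txt
find dupedemo -not -empty -type f -printf "%s\n" | sort -rn | uniq -d \
  | xargs -I{} -n1 find dupedemo -type f -size {}c -print0 | xargs -0 md5sum \
  | sort | uniq -w32 --all-repeated=separate
# lists f1.txt and f2.txt together; f3.txt never shows up
```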
:pimp: :pimp:


All times are GMT -7. The time now is 02:38 AM.

Powered by vBulletin® Version 3.8.8
Copyright ©2000 - 2025, vBulletin Solutions, Inc.
©2000- AI Media Network Inc