need to de-dupe keyword list... solution?
I have a list of keywords, one phrase or keyword per line.
The list has a lot of duplicates... what's the best way to strip them out? Help please |
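One approach:
- Import to Excel
- Sort alphabetically
- Run a formula comparing each entry to the one above and below, and mark it as a dupe, for example =IF(OR(A3=A4,A3=A2),"Duplicate","")
- Sort by the duplicate status and delete the extras. Note that this formula flags every copy, including the first, so leave one of each marked group in place; comparing only with the row above, =IF(A3=A2,"Duplicate",""), flags just the extras. |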
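For anyone who wants to try the sort-and-compare idea outside a spreadsheet, here is a minimal Python sketch. The sample keywords and the `flag_duplicates` name are made up for illustration, and it flags an entry only when it matches the one above it, so the first copy of each keyword survives.

```python
def flag_duplicates(keywords):
    """Sort the keywords, then flag each entry that equals the one above it."""
    rows = sorted(keywords)
    flagged = []
    for i, kw in enumerate(rows):
        # The first row has nothing above it, so it can never be a duplicate.
        is_dupe = i > 0 and kw == rows[i - 1]
        flagged.append((kw, "Duplicate" if is_dupe else ""))
    return flagged


# Hypothetical sample data standing in for the keyword column.
sample = ["red shoes", "blue hat", "red shoes", "green scarf", "blue hat"]
flagged = flag_duplicates(sample)

# Keep only the rows that were not flagged.
kept = [kw for kw, status in flagged if status == ""]
```

Here `kept` plays the role of the sheet after the flagged rows have been deleted.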
Here is a less manual Excel approach that I haven't tested, but it looks damn sexy:
http://www.rondebruin.nl/easyfilter.htm |
What about a solution for people that don't have Excel?
I don't have any office applications |
If you have access to *nix, try:
$ cat list.txt|uniq > newlist.txt |
In Excel you don't need a formula to remove dupes; there's a built-in feature for it. In older versions it's under Data > Filter > Advanced Filter, with the "Unique records only" box checked. In 2007 it's on the Data tab and is just called Remove Duplicates.
|
someone help me out with the syntax error on line 18 please?
Code:
#!/usr/bin/perl |
You must sort before you can uniq, because uniq only drops duplicate lines that sit right next to each other:
cat infile | sort | uniq > outputfile |
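To see why the sort step matters, here is a rough Python imitation of what uniq does with no options. It is only a sketch: the `uniq` function below is a stand-in written for illustration, not the real tool, and the sample lines are made up.

```python
from itertools import groupby


def uniq(lines):
    """Collapse runs of identical adjacent lines, roughly like `uniq` with no options."""
    return [key for key, _ in groupby(lines)]


# Hypothetical input with duplicates that are not next to each other.
lines = ["b", "a", "b", "a", "a"]

# Without sorting, only the adjacent pair of "a" lines is collapsed.
unsorted_result = uniq(lines)

# After sorting, equal lines are adjacent, so every duplicate is removed.
sorted_result = uniq(sorted(lines))
```

`unsorted_result` still contains repeated values, while `sorted_result` has each line exactly once.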
Quote:
thanks man :) |
While we are on the subject, does anybody have a good query for deduping MySQL tables across multiple fields?
|
That multiple fields bit isn't super clear... but if you want to combine data from several columns of one table into a single unique column, create a new table with one column that has a unique index on it. Then for each of the columns in the old table:
insert ignore into newtable (newcolumn) select oldcolumn1 from oldtable;
insert ignore into newtable (newcolumn) select oldcolumn2 from oldtable;
If you just want to keep all unique rows, create a new table with the same column structure, create a unique index across all columns, then:
insert ignore into newtable select * from oldtable; |
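Both variants can be tried without a MySQL server using Python's built-in sqlite3 module. This is a sketch under a stated assumption: SQLite spells the statement `INSERT OR IGNORE` rather than MySQL's `insert ignore`, so it shows the same unique-index idea, not the exact MySQL syntax. The `uniquerows` table, the index names, and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source table with repeated values and one repeated row.
conn.execute("CREATE TABLE oldtable (oldcolumn1 TEXT, oldcolumn2 TEXT)")
conn.executemany(
    "INSERT INTO oldtable VALUES (?, ?)",
    [("a", "b"), ("a", "b"), ("b", "c"), ("a", "c")],
)

# Variant 1: merge several columns into one unique column.
conn.execute("CREATE TABLE newtable (newcolumn TEXT)")
conn.execute("CREATE UNIQUE INDEX idx_newcolumn ON newtable (newcolumn)")
conn.execute("INSERT OR IGNORE INTO newtable (newcolumn) SELECT oldcolumn1 FROM oldtable")
conn.execute("INSERT OR IGNORE INTO newtable (newcolumn) SELECT oldcolumn2 FROM oldtable")
merged = sorted(row[0] for row in conn.execute("SELECT newcolumn FROM newtable"))

# Variant 2: keep one copy of each distinct row.
conn.execute("CREATE TABLE uniquerows (oldcolumn1 TEXT, oldcolumn2 TEXT)")
conn.execute("CREATE UNIQUE INDEX idx_allcols ON uniquerows (oldcolumn1, oldcolumn2)")
conn.execute("INSERT OR IGNORE INTO uniquerows SELECT * FROM oldtable")
unique_rows = sorted(conn.execute("SELECT * FROM uniquerows").fetchall())
```

In both variants the unique index does the work: any row that would collide with an existing entry is skipped instead of raising an error.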
Quote:
sort -u infile > outfile |
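One thing to keep in mind is that sorting reorders the list. If the original keyword order matters, or there is no *nix box handy, a small Python script is another option. This is only a sketch: the function names are made up, and it assumes plain UTF-8 text with one keyword per line and exact (case-sensitive) matching.

```python
def dedupe_keep_order(lines):
    """Drop repeated lines, keeping the first occurrence of each in its original position."""
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(lines))


def dedupe_file(in_path, out_path):
    """Write the unique lines of in_path to out_path and return how many lines were dropped."""
    with open(in_path, encoding="utf-8") as src:
        lines = [line.rstrip("\n") for line in src]
    unique = dedupe_keep_order(lines)
    with open(out_path, "w", encoding="utf-8") as dst:
        dst.write("".join(line + "\n" for line in unique))
    return len(lines) - len(unique)
```

Calling `dedupe_file("infile", "outfile")` would mirror the command above, except that the surviving lines stay in their original order.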