GoFuckYourself.com - Adult Webmaster Forum

GoFuckYourself.com - Adult Webmaster Forum (https://gfy.com/index.php)
-   Fucking Around & Business Discussion (https://gfy.com/forumdisplay.php?f=26)
-   -   Excel question - removing duplicate words (https://gfy.com/showthread.php?t=992516)

mkx 10-14-2010 02:44 PM

Excel question - removing duplicate words
 
I want to eliminate duplicate words in excel. Let's say I have column A1 that reads =

John thinks pizza pizza is better than pappa john

Excel will automatically rewrite it in Column B1 =

John thinks pizza is better than pappa

It needs to be a universal formula because it's not just pizza and john that is duplicated, got about 10,000 rows of data with duplicate words and I am working on a keyword list with limited length allowance.

Hope someone can help, I asked my friend and google, and they are smart but they couldn't help.

Adraco 10-14-2010 03:11 PM

There might be a more direct way in Excel that I do not know of, but this is how I would solve it, quick and dirty:

Export all to a comma separated file.
Search and replace all "spaces" by a comma.
Import into Excel, with each word in it's own column.
Then do some IF conditions, IF B1=A1; "then delete B1" ELSE leave B1 as is and proceed to C1.

mkx 10-14-2010 03:25 PM

yah i guess thats one way to do it :) i wonder if theres a formula for it though

Lilit 10-14-2010 04:34 PM

I doubt there is a formula cause if I understood it correctly you have to define which dupe to remove and it's going to be specific for each particular case.

If you don't care which of the dupes to remove then I have a solution for sentences consisting of up to 8 words.

First of all do what Adraco said, separate each word in a separate cell.

Then here's your formula:

IF((AND(A1<>B1,A1<>C1,A1<>D1 and etc up to 8 conditions)),A1,"blahblah")

Make sure that the 1st output cell is in 8+ columns from the last cell in the initial input

When done copy the output, go to a separate list and do special paste, select paste values. Then run search and replace for the "blahblah), empty the cells with the blahblah.

Then copy the output and paste in notepad, select the space between two words (it will be the Tab sign) run search and replace. replace it with a space. This function can create double spaces, you can remove them later.

Hope this helps to solve at least a part of your problem :winkwink:

Lilit 10-14-2010 04:38 PM

oh lol...I've just realized that basically I suggested the same as Ardaco, but from "the other end" :)

Bottom line, if you need to control sense of the sentences I doubt there is a way to automate the dupe removal.

PornMD 10-14-2010 06:29 PM

Quote:

Originally Posted by Adraco (Post 17607617)
There might be a more direct way in Excel that I do not know of, but this is how I would solve it, quick and dirty:

Export all to a comma separated file.
Search and replace all "spaces" by a comma.
Import into Excel, with each word in it's own column.
Then do some IF conditions, IF B1=A1; "then delete B1" ELSE leave B1 as is and proceed to C1.

This sort of thing is what I'd suggest. One thing you can do if you need to review all the cases instead of simply delete any such case is to have an etra column do all the checks with "STOP" if it finds a dupe and blank if it doesn't. You can then copy the column and paste in only values, and then starting from the top, if you hit end and then down, you'll go to the next "STOP", check/fix it and delete the "STOP", rinse and repeat until there's no more "STOP"s. Just suggesting that since there are actual cases in grammar when words can be repeated, in which case an automatic fix will have just screwed the sentence.


All times are GMT -7. The time now is 04:58 AM.

Powered by vBulletin® Version 3.8.8
Copyright ©2000 - 2025, vBulletin Solutions, Inc.
©2000-, AI Media Network Inc123