10-11-2011, 02:05 PM
raymor
Quote:
I want to clean a database of words that have bad letters in them. I know how to remove bad characters, but how can I remove the word along with it?

"The qu&&ick brown fox ju&&&mped over the lazy dog."

assuming & is a bad character, how can I end up with

"The brown fox over the lazy dog."

Basically, anything word that doesn't have an alpha-numeric or [.,<>?~@#$%^&*()] I want to remove the word.
You've asked for two very different things. Removing words that DO have bad characters is different from removing words that do NOT have "good" characters. What if a word has both?

To remove words that have the "bad" character:

\w is the class of word characters. You're looking for a string containing at least one "bad" character and, optionally, some word characters around it.
"Words", as you define them, are strings of word characters and &, so each character of a word is represented as \w|&.
So, assuming & is the bad character, the regular expression is:

(\w|&)*&(\w|&)*

preg_replace('/(\w|&)*&(\w|&)*/', "", $subject);
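
Note that the expression above removes only the word itself, so the spaces on both sides of it are left behind. Here is a minimal sketch of a variant that also swallows the whitespace after each removed word. The function name remove_words_with_amp is made up for illustration, and [\w&] is just a shorter way to write (\w|&):

```php
<?php
// Hypothetical helper, not part of the original answer.
function remove_words_with_amp(string $subject): string
{
    // [\w&]* matches the same characters as (\w|&)*.
    // The trailing \s* also removes the whitespace that followed the word.
    // preg_replace returns null on a regex error, so fall back to the input.
    return preg_replace('/[\w&]*&[\w&]*\s*/', '', $subject) ?? $subject;
}

$subject = 'The qu&&ick brown fox ju&&&mped over the lazy dog.';

// Original pattern: the bad words go, but doubled spaces remain.
echo preg_replace('/(\w|&)*&(\w|&)*/', '', $subject), "\n";
// prints "The  brown fox  over the lazy dog."

// Variant: trailing whitespace is removed along with the word.
echo remove_words_with_amp($subject), "\n";
// prints "The brown fox over the lazy dog."
```

If a bad word is the last thing in the string, the space before it is still left in place, so you may want to trim() the result.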


Quote:
Basically, anything word that doesn't have an alpha-numeric or [.,<>?~@#$%^&*()] I want to remove the word.
Removing words based on what they do NOT have is a different thing from removing them based on what they DO have, as above. In this case, you're looking for strings made up entirely of characters that are neither alphanumeric nor in [.,<>?~@#$%^&*()], bracketed by space characters, I suppose, since .,? and other non-word characters are part of your class. The negated class also has to exclude whitespace, or a match could run across several words.
So you're looking for:
\s[^a-zA-Z0-9\s.,<>?~@#$%^&*()]+\s

and replacing it with a single space delimiter like this:

preg_replace('/\s[^a-zA-Z0-9\s.,<>?~@#$%^&*()]+\s/', " ", $subject);
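
If what you actually want is to drop any word that contains even one character outside your allowed set, a split-and-filter approach may be easier to reason about than a single regular expression. This is only a sketch under that assumption: the function name keep_allowed_words is made up, "word" is taken to mean anything between whitespace, and the example uses { and } as the bad characters because & is in your allowed list:

```php
<?php
// Hypothetical helper; the name and the "any disallowed character"
// reading of the question are assumptions, not from the original post.
function keep_allowed_words(string $subject): string
{
    // Split on runs of whitespace and drop empty pieces.
    $words = preg_split('/\s+/', $subject, -1, PREG_SPLIT_NO_EMPTY);

    // Keep only the words made up entirely of allowed characters.
    $kept = array_filter($words, function (string $word): bool {
        return preg_match('/^[a-zA-Z0-9.,<>?~@#$%^&*()]+$/', $word) === 1;
    });

    // Rejoin with single spaces.
    return implode(' ', $kept);
}

echo keep_allowed_words('The qu{ick brown fox ju}mped over the lazy dog.'), "\n";
// prints "The brown fox over the lazy dog."
```

The trade-off is that all whitespace, including newlines and tabs, is normalized to single spaces in the result.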