Regular Expression Help

  • nibbi
    Confirmed User
    • Sep 2002
    • 104

    #1

    Regular Expression Help

    I need to remove style content code from within <td> tags on hundreds of html pages. I figure the easiest way to do this would be a regular expression and a text editor.

    So, anyone here know how to write a regular expression to turn this...

    <td height="15" class="xl24" style="height:11.25pt">

    Into this...

    <td>

    Thanks
    http://www.xRag.com
  • Lycanthrope
    Confirmed User
    • Jan 2004
    • 4517

    #2
    Maybe this can help you: http://www.htmlworkshop.com/srhtml98.html


    • com
      Confirmed User
      • Aug 2003
      • 4541

      #3
      Originally posted by nibbi
      I need to remove style content code from within <td> tags on hundreds of html pages. I figure the easiest way to do this would be a regular expression and a text editor.

      So, anyone here know how to write a regular expression to turn this...

      <td height="15" class="xl24" style="height:11.25pt">

      Into this...

      <td>
      Thanks


      Simple. The hardest bit is that you may have to debug some of my escaping...

      in VI:

      %s/<td height="15" class="xl24" style="height:11\.25pt">/<td>/g
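For anyone who would rather not hand-escape the pattern, here is a minimal Python sketch of the same exact-match replacement. It is not part of the original post; the names `LITERAL_TAG`, `LITERAL_PATTERN`, and `replace_exact_tag` are made up for illustration. `re.escape` does the escaping, so there is nothing to debug by hand.

```python
import re

# The exact tag from the original post, treated as literal text.
LITERAL_TAG = '<td height="15" class="xl24" style="height:11.25pt">'

# re.escape backslash-escapes the characters that are special in a regex
# (such as the period), so the pattern matches only this exact string.
LITERAL_PATTERN = re.compile(re.escape(LITERAL_TAG))


def replace_exact_tag(html: str) -> str:
    """Replace each exact occurrence of LITERAL_TAG with a bare <td>."""
    return LITERAL_PATTERN.sub("<td>", html)
```

Like the vi command, this only touches tags that match the string character for character; any other attribute combination is left alone.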

      Real. Professional. Hosting.
      .:Expect Nothing Less:.
      320-078-843 :: www.realprohosting.com :: [email protected]


      • com
        Confirmed User
        • Aug 2003
        • 4541

        #4
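        You shouldn't need to escape the colon, but the period should be escaped, since an unescaped . matches any character. Leave the <, >, = and " characters unescaped: in Vim, \< and \> are word-boundary anchors and \= is a quantifier, so escaping them stops the pattern from matching. Hope this helps. I'm running out the door to LA or I'd test it for ya! Ciao!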


        • Dynamix
          G F Y not
          • Jan 2004
          • 2910

          #5
          Is it just <TD> tags? If so I can write something real quick to do it for you

          TGPFactory Full TGP Design & Installation Services
          ICQ 250 142 484 · AIM TGPDynamix · Email: patrick (at) tgpfactory (dot) com
          See who I am at AdultWhosWho.com!


          • nibbi
            Confirmed User
            • Sep 2002
            • 104

            #6
            Originally posted by com

            in VI:

            %s/<td height="15" class="xl24" style="height:11\.25pt">/<td>/g

            Actually, the contents of the tags contain many, many different variations... so that won't work. I need to be able to clear out *anything* that is within the tag.

            • nibbi
              Confirmed User
              • Sep 2002
              • 104

              #7
              Originally posted by Dynamix
              Is it just <TD> tags? If so I can write something real quick to do it for you
              Yes, just <td> tags. Thank you.

              • com
                Confirmed User
                • Aug 2003
                • 4541

                #8
                Well, one last tidbit before I leave: the character $ means "end of line". A little reading on regexes will show you how to delete everything from <td through the end of the tag and close it off with a >.

                %s means substitute on every line of the file (% is the range, s is the substitute command).
                %s/foo/bar/g means replace all instances of foo with bar. ^ is the beginning of a line and $ is the end. Good luck, wish I had more time! Ciao.
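To make the anchors and the global substitute concrete, here is a small Python sketch. It is only an illustration, not anything from the thread: `SAMPLE`, `lines_starting_with_td`, and `replace_all` are hypothetical names, and Python's `re` syntax differs from vi's in places.

```python
import re

SAMPLE = "<td>one\n  <td>two\n<tr>three"


def lines_starting_with_td(text: str) -> list[str]:
    """Return every whole line that begins with <td.

    With re.MULTILINE, ^ matches at the start of each line and $ at the
    end of each line, instead of only at the start and end of the string.
    """
    return re.findall(r"^<td.*$", text, flags=re.MULTILINE)


def replace_all(text: str, old: str, new: str) -> str:
    """Rough equivalent of %s/old/new/g where old is literal text."""
    # re.escape keeps characters like . from acting as wildcards;
    # the lambda keeps backslashes in `new` from being interpreted.
    return re.sub(re.escape(old), lambda match: new, text)
```

Only the first line of `SAMPLE` starts with `<td`, so the indented second line is skipped by the `^` anchor.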


                • nibbi
                  Confirmed User
                  • Sep 2002
                  • 104

                  #9
                  Got something to work:

                  Search for this:
                  <td[^>]*>

                  Replace with this:
                  <td>

                  This was done in TextPad.
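The same search and replace can be sketched in Python for anyone without TextPad. This is a minimal sketch, not a full HTML parser: the pattern is the one from the post, while `re.IGNORECASE` and the names `TD_OPEN` and `strip_td_attributes` are assumptions added here so that `<TD ...>` is also caught.

```python
import re

# Pattern from the post: "<td", then any run of characters that are not
# ">", then the closing ">". IGNORECASE is an added assumption.
TD_OPEN = re.compile(r"<td[^>]*>", re.IGNORECASE)


def strip_td_attributes(html: str) -> str:
    """Replace every opening td tag, whatever its attributes, with <td>."""
    return TD_OPEN.sub("<td>", html)
```

Because `[^>]*` stops at the first `>`, an attribute value that itself contains a `>` would cut the match short. Closing `</td>` tags and other tags such as `<tr>` are not touched.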

                  • Dynamix
                    G F Y not
                    • Jan 2004
                    • 2910

                    #10
                    http://www.pimpts.com/dl/tdslicer.exe

                    Drag and drop files from Windows Explorer onto this. Each file is saved under its original name with _new appended.

                    ie:
                    original - c:\file.html
                    new - c:\file_new.html

                    Only parses .html and .htm files
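For reference, the behavior described above (write a `_new` copy, skip anything that is not .html or .htm) can be sketched in Python. This is not the source of tdslicer.exe, which is not shown in the thread; `new_path_for`, `clean_file`, and the byte-level approach are assumptions made for the sketch.

```python
import re
from pathlib import Path
from typing import Optional

# Work on raw bytes so the file's encoding and line endings are preserved.
TD_OPEN = re.compile(rb"<td[^>]*>", re.IGNORECASE)
HTML_SUFFIXES = {".html", ".htm"}


def new_path_for(path: Path) -> Path:
    """file.html -> file_new.html, in the same directory."""
    return path.with_name(f"{path.stem}_new{path.suffix}")


def clean_file(path: Path) -> Optional[Path]:
    """Write a copy of path with bare <td> tags; return the new path.

    Returns None without reading anything if path is not .html or .htm.
    """
    if path.suffix.lower() not in HTML_SUFFIXES:
        return None
    cleaned = TD_OPEN.sub(b"<td>", path.read_bytes())
    target = new_path_for(path)
    target.write_bytes(cleaned)
    return target
```

Looping over `Path(folder).rglob("*.htm*")` and calling `clean_file` on each result would cover a whole directory tree; the original files are left as they were.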


                    • nibbi
                      Confirmed User
                      • Sep 2002
                      • 104

                      #11
                      Originally posted by Dynamix
                      Cool tool, Thanks.

                      I used it on a few of the files and it worked perfectly. I used TextPad for the rest. Seemed to do the job faster.

                      Thanks again for helping me with this.