Text Compare Replacement with Regular Expressions


  • Text Compare Replacement with Regular Expressions


    I've got 2 files. They have many differences, some of which I want to ignore. I can make lines unimportant using Replacements (without regular expressions), e.g. fixed text of "Monday" in the left file and "Tuesday" in the right file. Good so far.

    However, there's more. In the left file, I may have a version number like "v7.2.3" or "v7.3.4", and in the right file have "v7.6.7" or "v7.4.5". They may or may not be the same version numbers. I want to ignore all these.

    I tried using a regular expression replacement item like this...

    Text to find: v7\.\d+\.\d+
    Replace with: v7\.\d+\.\d+

    This didn't work. Tried...
    Text to find: v7\.\d+\.\d+
    Replace with: .

    Text to find: v7\.(\d+\.\d+)
    Replace with: v7\.$1

    Still no luck. For all the above the Regular Expression box was ticked, and Side=Left.

    I also tried various combinations, duplicating the items and trying the right side as well / instead.

    If I do this...
    Text to find: v7\.\d+\.\d+
    Replace with: v7.2.2

    Then it works for all various versions on the left, but only for v7.2.2 on the right. Obviously. But I don't want to be limited on this in the right file.

    Anyone know the solution to this?



  • #2
    Okay, I've not got it working with Regular Expression Replacements, but I've done what I need via Grammar Rules. I found Thread 4417 "XML Datetime Compare" raised by Herpel and solved by Michael Bulgrien.

    Basically, I set up a new grammar rule called "VersionElement", with a grammar of basic regular expression "v\d+\.\d+\.\d+", and made sure I unticked "VersionElement" in the list of grammar elements. It's then treated as unimportant.

    If someone knows a better way, then feel free to comment. It works though, which is the main thing! Cheers Michael!



    • #3

      Glad you were able to work out a solution so far. The trick with replacements is that the "Replace with" text needs to be defined literally and cannot contain wildcard characters or sections.

      You have a proper definition with:
      Text to find: v7\.(\d+\.\d+)
      Replace with: v7\.$1

      But $1 takes the value of whatever was matched inside the parentheses, so each version number is replaced with itself. The two sides therefore only match when the left and right version numbers are already equal. This was probably not the result you were looking for.
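To illustrate the point about $1 outside Beyond Compare, here is a minimal Python sketch. It assumes Python's `re` module as a stand-in (where `$1` is written `\1`); the helper names are hypothetical, and this is not a description of how Beyond Compare applies replacements internally.

```python
import re

# Pattern from the post. In Python the backreference is \1 rather than $1,
# and the dot in the replacement text needs no escaping.
PATTERN = re.compile(r"v7\.(\d+\.\d+)")


def replace_with_backreference(text: str) -> str:
    """Put the captured digits straight back: the text comes out unchanged."""
    return PATTERN.sub(r"v7.\1", text)


def replace_with_literal(text: str) -> str:
    """Collapse every v7.x.y onto one fixed version string."""
    return PATTERN.sub("v7.2.2", text)


left = "release v7.2.3"
right = "release v7.6.7"

# The backreference version changes nothing, so left and right still differ.
print(replace_with_backreference(left) == left)
# The literal version maps both sides onto the same text.
print(replace_with_literal(left) == replace_with_literal(right))
```

Both checks print `True`: the backreference leaves each side as it was, while a literal replacement is what actually makes different versions compare as equal.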

      From your example, an unimportant grammar element sounds like the best way to tackle this issue. It will treat the version text as unimportant even when it aligns with non-version text on the other side. That non-version text is not covered by this definition, though, so it would still show as an important difference (unless it, too, had an unimportant definition covering its case).
      Aaron P Scooter Software


      • #4
        Thanks for your reply.

        Unfortunately, it looks like I've not quite got it working. It worked okay on my test file, but then didn't on my XML files I was comparing.

        I created a new "XML with Versions" text format which I can then pick in the file compare window. It has my new "VersionElement" grammar element, unticked to make it unimportant, and I edited the grammar to move it up to the top of the importance list.

        Bit more fiddling to try.....

        it's just a case of I want a line like this...
        <UPDATE file="blah" path="http://blah/v1.2.3/test"/>
        to compare with a file with a line like this...
        <UPDATE file="fred" path="http://blah/v1.3.4/test"/>

        and it to only pick up on the difference of "blah" and "fred".

        But I don't want to hard-code the "v1.2.3" or "v1.3.4" bits (they may vary). Any help appreciated.




        • #5
          Since the item that you want to declare as unimportant is embedded in another grammar element (a string), you will need to remove the other grammar element:

          Delete the String grammar element

          Create a new grammar element

          Element name: Version
          Category: Basic
          Text matching: v\d+\.\d+\.\d+
          Regular Expression
          Not case sensitive

          The contents of your other strings will still be processed under the "Everything else" category on the Importance tab.
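The regular expression above can be tried outside Beyond Compare. The following is a rough Python sketch that masks version numbers in the two sample lines from earlier in the thread, so that only the remaining text differs. The `mask_versions` helper and the `v#.#.#` placeholder are assumptions made for illustration; Beyond Compare's grammar elements classify text rather than rewrite it.

```python
import re

# Same expression as the "Version" grammar element; IGNORECASE mirrors
# the "Not case sensitive" setting.
VERSION = re.compile(r"v\d+\.\d+\.\d+", re.IGNORECASE)


def mask_versions(line: str, placeholder: str = "v#.#.#") -> str:
    """Replace every version number with a fixed placeholder."""
    return VERSION.sub(placeholder, line)


left = '<UPDATE file="blah" path="http://blah/v1.2.3/test"/>'
right = '<UPDATE file="fred" path="http://blah/v1.3.4/test"/>'

masked_left = mask_versions(left)
masked_right = mask_versions(right)

print(masked_left)
print(masked_right)
```

After masking, the two lines differ only in "blah" versus "fred", which is the difference the comparison is meant to report.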
          BC v4.0.7 build 19761


          • #6
            I am comparing the assembly files generated by different C compiler versions, which contain labels of the form L\d+.
            After failing with the double-regular-expression method, I also tried creating a grammar element for my problem. This sort-of worked, in that inserting a label (and therefore incrementing the number of all subsequent labels) resulted in unimportant differences. However, the inserted label itself was also unimportant.
            I want to be able to flag a CHANGE as unimportant, but not an insertion or deletion - I think a replacement with regexps on both sides would permit this.
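One way to picture the behaviour asked for here, outside Beyond Compare, is to normalise label numbers on both sides before aligning the lines. This is a minimal Python sketch using the standard `difflib` module; the function names and sample assembly lines are made up for illustration, and it is not a Beyond Compare feature.

```python
import difflib
import re

# Labels of the form L<number>, as described in the post.
LABEL = re.compile(r"\bL\d+\b")


def normalize(line: str) -> str:
    """Hide the label number so L2 and L3 compare as equal."""
    return LABEL.sub("L#", line)


def important_differences(left, right):
    """Return non-equal opcodes after label numbers are normalised.

    Renumbered labels align as equal; inserted or deleted lines remain.
    """
    matcher = difflib.SequenceMatcher(
        a=[normalize(line) for line in left],
        b=[normalize(line) for line in right],
        autojunk=False,
    )
    return [
        (tag, left[i1:i2], right[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


old = ["L1:", "  mov r0, r1", "L2:", "  ret"]
new = ["L1:", "  mov r0, r1", "L2:", "  nop", "L3:", "  ret"]

print(important_differences(old, new))
print(important_differences(["L2:", "  jmp L2"], ["L5:", "  jmp L5"]))
```

The first call reports the inserted `nop` and `L3:` lines, while the second, where only the numbers changed, reports nothing. The trade-off is that a jump retargeted to a different label would also be hidden.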


            • #7
              Hi Michael,

              Great, thanks. That worked a treat. I see it was because the string grammar was getting in on the deal.

              So with my test files with say
              hello v7.1.2
              hi v7.2.3

              it worked fine, showing just "hello" and "hi" as differences. The v7.x.x bit was successfully ignored.

              When having the real file which has
              hello "v7.1.2"
              hi "v7.2.3"

              it then didn't work as the string grammar rule then decides it IS important even though the Version rule said it wasn't important.

              Maybe we could have something in BC3 for the future where you can have the grammar elements setup with some kind of logic like
              (GrammarRule1 AND GrammarRule2) OR (GrammarRule3) OR (GrammarRule4)

              Not sure how that would work, but basically I wanted it to match my "Version" grammar rule which was higher up the order, and then jump out and not process the "String" rule.

              Anyway, thanks for the help and solution.

              It's an excellent product anyway. Wouldn't know what to do without it now!