Unique anchor points?

  • Unique anchor points?

    I'm trying to compare a ~65k auto-generated text file for differences in a couple of sections while I iterate on the generation process.

    My problem is that a text comparison gets confused fairly quickly about section layouts, which is to be expected; I don't expect it to parse arbitrary formats perfectly.

    There are several unique sections that I can manually "Align With..." by searching for their unique strings, which fixes things up and gives a much better diff. But as soon as I generate new output for comparison, I lose all those manual alignments.

    I'd like to insert some firm anchor points as hints to Beyond Compare so I can skip the manual "Align With..." steps. I can guarantee that each anchor is a unique line in its file and exactly matches a single line in the other file.
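
    To make that guarantee concrete, here is a minimal sketch in Python of a check that each anchor appears exactly once on both sides. It runs outside Beyond Compare; the function name `check_anchors` and the sample anchor strings are made up for illustration.

```python
from collections import Counter

def check_anchors(left_lines, right_lines, anchors):
    """Return a list of problems; an empty list means every anchor is usable."""
    left_counts = Counter(left_lines)
    right_counts = Counter(right_lines)
    problems = []
    for anchor in anchors:
        for side, counts in (("left", left_counts), ("right", right_counts)):
            # An anchor is only usable if it occurs exactly once on each side.
            if counts[anchor] != 1:
                problems.append(
                    f"{anchor!r} appears {counts[anchor]} times in {side} file"
                )
    return problems

# Hypothetical sample data: the second anchor is duplicated on the right.
left = ["== HEADER ==", "a", "== BODY ==", "b"]
right = ["== HEADER ==", "x", "y", "== BODY ==", "== BODY =="]
print(check_anchors(left, right, ["== HEADER ==", "== BODY =="]))
```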

    I've been trying to use the "Everything Else" file format grammar options to accomplish this. I've got a grammar rule that correctly identifies what I call "Complex Text Comparison" formats. But if I copy the relevant matching rule down to "Line weights" and give it a priority of 5, it doesn't seem to make any difference to the automatic comparison.

    In my test files, the first relevant anchor in each file is separated by ~5k lines. Is there a maximum distance at play here? Is there a super secret hinting system I can leverage to make rock-solid unique anchor points?

  • #2

    In Text Compare, under the Session menu -> Session Settings, Alignment tab, does switching to a different alignment algorithm help handle your files automatically?

    A combination of grammar element definitions and alignment algorithms influences how the overall alignment works, but the algorithm switch alone might be enough to handle your files.

    Line Weights are useful for breaking ties when the alignment algorithm has to choose between otherwise equal matches, but they do not function as full anchors.
    Aaron P Scooter Software


    • #3
      Switching to "Patience Diff Alignment" seems to have fixed the unique anchors. I can't tell yet whether the ordinary diffs from there on are as good.
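
      For anyone following along, the general idea behind patience-style alignment can be sketched roughly as below: pair up lines that occur exactly once in both files, then keep the longest run of pairs that stay in the same order. This is a generic Python illustration, not Beyond Compare's implementation, and `patience_anchors` is a hypothetical name.

```python
from bisect import bisect_left
from collections import Counter

def patience_anchors(left, right):
    """Return (left_index, right_index) pairs for lines unique in both
    inputs, keeping only the longest set of pairs that preserve order."""
    left_counts, right_counts = Counter(left), Counter(right)
    right_pos = {line: j for j, line in enumerate(right)
                 if right_counts[line] == 1}
    # Candidate pairs, already ordered by position in the left input.
    pairs = [(i, right_pos[line]) for i, line in enumerate(left)
             if left_counts[line] == 1 and line in right_pos]

    # Longest increasing subsequence over the right-hand indexes.
    tails = []      # tails[k]: index into pairs ending the best run of length k+1
    tail_vals = []  # right-hand index of each tails entry (kept sorted)
    prev = [None] * len(pairs)
    for n, (_, j) in enumerate(pairs):
        k = bisect_left(tail_vals, j)
        prev[n] = tails[k - 1] if k > 0 else None
        if k == len(tails):
            tails.append(n)
            tail_vals.append(j)
        else:
            tails[k] = n
            tail_vals[k] = j

    # Walk back from the end of the longest run to recover it.
    result = []
    n = tails[-1] if tails else None
    while n is not None:
        result.append(pairs[n])
        n = prev[n]
    return result[::-1]

# "x" repeats, so it is never an anchor; "A" is out of order relative to "B".
print(patience_anchors(["A", "x", "B", "x", "C"], ["B", "A", "y", "C"]))
# → [(2, 0), (4, 3)]
```

      The regions between consecutive anchors would then be aligned separately, which is why unique lines far apart in the two files can still pair up.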

      Thanks, I wasn't aware of the session alignment settings.

      I'm still curious whether I can set up the high-priority lines so that a patience pass runs on them first, followed by whatever the session settings happen to be.
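
      As an illustration of that two-pass idea, here is a standalone Python sketch (not a Beyond Compare feature): the first pass splits both files at the known anchor lines, and the second pass diffs each pair of segments with Python's `difflib`. It assumes each anchor occurs exactly once in both files and in the same order; `split_at_anchors` and `anchored_diff` are hypothetical names.

```python
import difflib

def split_at_anchors(lines, anchors):
    """Split lines into segments; every segment after the first starts
    at an anchor line. The first segment holds anything before it."""
    anchor_set = set(anchors)
    segments, current = [], []
    for line in lines:
        if line in anchor_set:
            segments.append(current)
            current = []
        current.append(line)
    segments.append(current)
    return segments

def anchored_diff(left, right, anchors):
    """Diff each anchor-delimited segment on its own, so a change in one
    section cannot pull the alignment of another section off course.
    Assumes the same anchors, in the same order, on both sides."""
    out = []
    pairs = zip(split_at_anchors(left, anchors),
                split_at_anchors(right, anchors))
    for seg_left, seg_right in pairs:
        # Drop ndiff's "? " intraline hint rows to keep the output simple.
        out.extend(row for row in difflib.ndiff(seg_left, seg_right)
                   if not row.startswith("? "))
    return out

left = ["#A", "1", "2", "#B", "3"]
right = ["#A", "1", "#B", "3", "4"]
for row in anchored_diff(left, right, ["#A", "#B"]):
    print(row)
```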

      My end goal is to export a rule set that coworkers can import, and I'd like to keep it light: just the grammar additions.


      • #4

        Your export will need to include both the File Format and the default Text Compare session settings. Alignment and Importance are Session Settings (for example, whether Comments are Unimportant or Important by default), so there isn't a single package for a specific extension, since some settings are not extension-based.

        The alignment algorithm does not support a 'two pass' method; it's a single selection that is applied on load. Different files can align better with different algorithms; one of Patience Diff's strengths is finding and working with detected brackets as anchors. Is there a reason you think Standard would work better for these files?
        Aaron P Scooter Software