
How To Begin - Comparing .Doc Files


  • How To Begin - Comparing .Doc Files

    Hi Everyone

    I just got the BC software and I'm totally confused by it. At first I used diffdoc, but I decided to switch because it froze when loading .doc files containing an extremely large amount of text. So I chose BC, and so far the files load well, but I have no idea how to use it. Help\Contents doesn't say anything. After loading the documents, everything is in red, with many empty fields in between. Since I just got it, I can't judge the software or say that it doesn't work well or doesn't do what I want. The problem is probably on my side: either I don't know how to use it the way I want to, or I'm using it the wrong way. What I want from the software is this: wherever (and not only in the same position!) the same word appears in both .doc files, it should be shown as a duplicate (two exactly identical words in BOTH files). I would also like similar entries to be shown. For example (in each pair below, the entry / word to the left of ''-'' is in the left pane of the program, and the entry / word to the right of ''-'' is in the right pane):

    Label - Label
    [ex; - [ex;
    Example (!!!!!) - Example
    Example - Exmple

    Please note: in the third and fourth examples the entries aren't exactly the same, but I would still like them to be shown; not as duplicates (the duplicates are the first and second examples) but as something like ''almost the duplicate'' or ''very similar''. In the fourth example I didn't make a mistake: I typed ''Exmple'' (instead of ''Example'') on purpose.

    All of these results should be found regardless of where the entries (words) are located in the files.

    Hopefully someone can help or at least point me to a useful tutorial. Thank you.
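
The position-independent duplicate search described in this post can be sketched outside Beyond Compare. This is a minimal illustration only, assuming the text of each .doc file has already been exported to plain text with one entry per line; `common_entries` is a hypothetical helper name, not a Beyond Compare feature.

```python
def common_entries(left_lines, right_lines):
    """Return entries that occur in both inputs, regardless of line position."""
    # Strip surrounding whitespace and skip empty lines (assumption: one entry per line).
    left = {line.strip() for line in left_lines if line.strip()}
    right = {line.strip() for line in right_lines if line.strip()}
    # Set intersection ignores where in each file an entry appears.
    return sorted(left & right)


# Entries taken from the example pairs above, deliberately in a different order.
left_file = ["Label", "[ex;", "Example (!!!!!)", "Example"]
right_file = ["Example", "Exmple", "[ex;", "Label"]
print(common_entries(left_file, right_file))
```

Only exact matches are reported here; ''Example (!!!!!)'' and ''Exmple'' are not, because a set intersection has no notion of similarity.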

  • #2

    The section in the Help file you are looking for is under Using Beyond Compare: Text Compare.

    If you would like, please email us a pair of example files and a full screen screenshot to [email protected]
    The screenshot will let us know what you are seeing, and what you are confused about.
    If you are able to email us, please include a link back to this forum post.

    In your above example, Label and [ex; should line up and be equal/black text. Example should also be black, but the (!!!!!) should show as red text, and the a in the 4th line should be red. If there is a difference anywhere on the line, then the whole line's background color will be red, but only the actual different text should be colored red as a foreground color. If you have black text on a red background, that means the difference is elsewhere in the line. Try turning on Show Whitespace in this case.
    Aaron P Scooter Software


    • #3

      Nothing that is equal (the same word) but not in the same position is shown. Absolutely nothing. Everything is in red, even the background. This is because it compares only within the same line. By the same line I mean an imaginary horizontal line that starts at the leftmost margin of the Beyond Compare window and ends at the rightmost margin. Only entries in the same position (on the same line) are compared. What I would like to see (black for duplicates = same words) is a comparison across the whole document. Even if an entry such as ''Bigger Book'' is on the first line of one file and the same entry ''Bigger Book'' is on line 12454 of the second file, it should be shown as a duplicate (equal words).

      Also, it would be great if ''almost the duplicate'' entries were shown. Here are a few examples of what I mean by ''almost the duplicate'' (to the left of ''-'' is the left file / window, to the right of ''-'' is the right file / window):

      Enter - Enterer
      SideA - Side A
      Something - Something (!!!!!)
      Example - Example (US)
      Exam (UK) - Exam (US)
      [aab - [ aab

      And so on. I'm not sure whether differences can be excluded from the comparison results. I'm only looking for duplicates, and hopefully entries that are totally different can be ignored. The empty fields (which show that part of the text is missing from one of the files) can also be ignored because I'm not interested in them. Here are a few differences that can be left out of the results because the entries are not the same and (in my opinion) not even similar:

      City - Cities
      City - Country
      Book - Plane
      SideA - SideB
      Side C - SideD
      Side E - Side F

      I know this is hard to do, because one user may say something is still similar while another says it isn't. I would be satisfied if just the duplicates were found, and even if (in the last three examples) the word ''Side'' were shown as a duplicate (an equal word in black), I could still live with it. But what I'm getting from BC at the moment is terrible and I really need to fix this.

      Thank you...
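
One way to approximate the ''almost the duplicate'' idea is a string similarity score. The sketch below uses `difflib.SequenceMatcher` from the Python standard library; the helper names and the 0.8 threshold are assumptions chosen for illustration, and nothing here reflects how Beyond Compare works internally.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a score between 0.0 (nothing in common) and 1.0 (identical)."""
    return SequenceMatcher(None, a, b).ratio()


def near_duplicates(left_entries, right_entries, threshold=0.8):
    """List (left, right) pairs that are similar but not identical.

    Every left entry is checked against every right entry, so position
    in the file does not matter. The threshold is an arbitrary assumption.
    """
    pairs = []
    for left in left_entries:
        for right in right_entries:
            if left != right and similarity(left, right) >= threshold:
                pairs.append((left, right))
    return pairs


print(near_duplicates(["Example", "Book"], ["Exmple", "Plane", "Example"]))
```

As the post says, where ''similar'' ends is subjective. A pair such as SideA / SideB also scores high on this measure, so the threshold would need tuning against real data, and comparing every pair becomes slow for very large files.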


      • #4


        • #5

          Beyond Compare considers line breaks important. If your data is on different lines:
          "Enter SideA Something "
          compared to
          Side A
          Then Beyond Compare will match on Enter, but will consider the rest a difference. Order is also important, so if the words are out of order they will be considered different. If this is the case, you would need to run an External Conversion that standardizes the file, placing the line breaks in standard places and sorting the words in the file. We have some examples of this for HTML or XML files, and a KB article for generic External Conversion:

          If you still need further assistance, please send in a pair of example files to [email protected]
          include a copy of your (Help menu -> Support; Export), and a link back to this forum post.
          Aaron P Scooter Software
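
The standardizing step described in this reply (placing line breaks in standard places and sorting the words) could look roughly like the script below. It is a sketch under several assumptions: the input is already plain text rather than a binary .doc file, entries are separated by whitespace, and the script receives a source path and a target path as its two command-line arguments. How Beyond Compare actually invokes a conversion command is not shown here.

```python
import sys


def normalize(text):
    """Put each whitespace-separated word on its own line, sorted."""
    words = text.split()  # splits on spaces, tabs, and line breaks alike
    return "".join(word + "\n" for word in sorted(words))


def main(source_path, target_path):
    # Read the original text and write the standardized version next to it.
    with open(source_path, encoding="utf-8") as src:
        converted = normalize(src.read())
    with open(target_path, "w", encoding="utf-8") as dst:
        dst.write(converted)


if __name__ == "__main__":
    # Hypothetical usage: python normalize_words.py input.txt output.txt
    if len(sys.argv) == 3:
        main(sys.argv[1], sys.argv[2])
```

After both files have gone through the same conversion, identical words end up on matching lines, which is the case a line-by-line comparison handles well. Multi-word entries such as ''Bigger Book'' would be split apart by this simple version.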