Ignoring MS Word Control(?) Strings

  • aussieboykie
    replied
    Unfortunately they are two versions of a confidential commercial document. When I have some spare time I'll see if I can extract sections that can be deidentified but still demonstrate the problem.

    Regards, AB


  • Aaron
    replied
    I'm glad you were able to troubleshoot this issue.

    Would you be able to send the sample files to [email protected]? They may make an interesting test case for us to look at.

    "
    ...please email us at [email protected] with:
    -a link to this forum post
    -a pair of example files (with even just this line)
    -your support.zip from the Help menu -> Support; Export
    "


  • aussieboykie
    replied
    Further investigation reveals that the choice of File Format plays a part. The problem occurs if I'm using MS Word Documents Extended but not if I use MS Word Documents. It also appears to be related to the way that Word tables are converted to text. With the Extended filter, each cell is output as a separate line, whereas with the normal (built-in?) filter each table row is output as a single line of text.

    Regards, AB


  • Aaron
    replied
    Hello,

    Toggle on Show Visible Whitespace to see whether the sequence is being treated as an End of Line character or as whitespace.

    If you are still having trouble please email us at [email protected] with:
    -a link to this forum post
    -a pair of example files (with even just this line)
    -your support.zip from the Help menu -> Support; Export


  • aussieboykie
    started a topic Ignoring MS Word Control(?) Strings

    When comparing two very similar versions of an MS Word document I often see some lines that look identical until viewed in Hex. Here's an example:

    ASCII = +14 weeks

    HEX-L = E2 80 8E 2B 31 34 20 77 65 65 6B 73 0D 0A
    HEX-R = E2 80 8E 2B 31 34 20 77 65 65 6B 73 E2 80 8E 0D 0A

    How do I tell BC3 to ignore E2 80 8E, which I assume to be some sort of MS Word control string? I followed the process described in this Knowledge Base article but it hasn't helped. The orphan E2 80 8E strings still appear as important (red) differences.

    Regards, AB
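
For reference, the byte sequence E2 80 8E in the hex dump above is the UTF-8 encoding of U+200E LEFT-TO-RIGHT MARK, an invisible Unicode formatting character rather than anything specific to Word. The following is a minimal Python sketch of one possible workaround, assuming the converted text is UTF-8 and that it is acceptable to pre-process it outside Beyond Compare before comparing. The helper name `strip_lrm` is hypothetical and is not part of BC3.

```python
# U+200E LEFT-TO-RIGHT MARK is encoded in UTF-8 as the bytes E2 80 8E.
LRM = "\u200e"


def strip_lrm(data: bytes) -> bytes:
    """Return data with every U+200E removed (hypothetical helper, assumes UTF-8 input)."""
    text = data.decode("utf-8")
    return text.replace(LRM, "").encode("utf-8")


# The two lines from the hex dump in the post.
left = bytes.fromhex("E2 80 8E 2B 31 34 20 77 65 65 6B 73 0D 0A")
right = bytes.fromhex("E2 80 8E 2B 31 34 20 77 65 65 6B 73 E2 80 8E 0D 0A")

print(left == right)                        # the raw bytes differ
print(strip_lrm(left) == strip_lrm(right))  # the cleaned lines match
```

Stripping the mark this way changes the text being compared, so it is only suitable when the left-to-right marks carry no meaning in the document.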