Script for data compare of large number of CSV files

  • Script for data compare of large number of CSV files

    I need to automate a reconciliation of two folders that contain over 50 CSV files.

    I have created a BC3 script that creates HTML output reports. I have found some issues:
    • Generated reports do not include a helpful summary of the counts of matches, approximate matches, mismatches, and the totals. Is there a way to get that or is the situation still the same as in this 2007 post?
    • Comparison and HTML report writing seem to take around 8 minutes. However, if I open the session and run Compare Results, it only takes 1.5 minutes. Is it possible to speed up the scripted run? If not, is there a way to start BC3 from the command line in a session (which I know I can) AND start the folder comparison automatically (to save the user manually selecting all files and clicking on Compare Results)?
    • I am creating rules for each of the files and saving them to a BC3 session. I have discovered that the folders for comparison are stored as absolute paths, so if the session is used on the same directories under a different absolute path, all the rules are lost. E.g. if the rules are defined for C:\temp\side-a\file1.csv and then used on C:\temp-different\side-a\file1.csv, BC3 does not recognise the rules. Is there a way to migrate rules, e.g. if the directory is changed but the relative path is the same, so that the rules would still kick in?

  • #2

    1) A folder-report shows the overall status of each file but no details. At the folder level, in either the interface or scripting, you can expand the subfolders and then select diff.files to get a selection of the files you wish to report on, then generate a text-report or data-report on that selection. This report type can be a summary or can show the actual text that is different, depending on how it is configured. I recommend trying the graphical interface first: select your files (Display Filters to show Differences, then Edit menu -> Expand All, Select All Files), then right click or use the Actions menu -> File Compare Report and try out the different options to find the configuration that best meets your needs.

    2) I would expect to see similar performance results. My hunch is that the graphical interface is configured to use filtering or different comparison criteria that allow it to run quicker. If you are loading a folder session, you can load the same session in both the graphical interface and scripting to make sure all session settings are applied equally. If you are still having trouble, it would help to email us your support information from the Help menu -> Support dialog. Let us know which graphical session you are loading, and also include your script.txt, and we'll try to find where they might be configured differently.

    3) This is not supported in the current release; we only support absolute paths, but relative path support is on our wishlist.
    Aaron P Scooter Software
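
The scripting route described in point 1 above can be sketched as a small Python helper that writes the script file for you. This is a minimal sketch, not a tested recipe: the function names, the folder paths, and the report file name are all hypothetical, and the script commands it emits (load, expand all, select diff.files, data-report with layout:summary and output-options:html-color) should be checked against the scripting reference for your Beyond Compare version before use.

```python
from pathlib import Path


def build_report_script(left: str, right: str, report_path: str) -> str:
    """Return the text of a Beyond Compare script that reports on differing files.

    Assumption: the emitted commands match your Beyond Compare version's
    scripting reference; verify them before relying on this.
    """
    lines = [
        "# Load the two base folders to compare.",
        f'load "{left}" "{right}"',
        "# Open every subfolder so nested files can be selected.",
        "expand all",
        "# Select only the files that differ between the two sides.",
        "select diff.files",
        "# Write a summary-style data report for the selection as HTML.",
        f'data-report layout:summary output-to:"{report_path}" output-options:html-color',
    ]
    return "\n".join(lines) + "\n"


def write_report_script(script_path: Path, left: str, right: str, report_path: str) -> Path:
    """Write the generated script to disk and return its path."""
    script_path.write_text(build_report_script(left, right, report_path), encoding="utf-8")
    return script_path


if __name__ == "__main__":
    # Hypothetical paths, taken from the example in the question.
    print(build_report_script(r"C:\temp\side-a", r"C:\temp\side-b", r"C:\temp\report.html"))
```

Generating the script from a template keeps the folder paths in one place, so the same report settings can be reused when the folders move.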


    • #3
      I should add a workaround for the absolute paths: if you have a script copy both folders to a fixed absolute location on your hard drive first, then compare those two CompareA\ and CompareB\ locations, the absolute paths will always match. This adds a bit of overhead, but could allow an automated solution to work. The other automated option is to edit the settings .xml with an external script before calling the automated task. As with any edits to our settings .xml files, I would recommend backing up your settings first; a small typo could corrupt the files and make them unusable.
      Aaron P Scooter Software
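
The copy-first workaround above could look roughly like the following Python sketch. It assumes a hypothetical staging root that you choose once (for example C:\bc-staging) and uses the CompareA and CompareB folder names from the post; the function name and all paths are illustrative only. Each run wipes and recopies the staged folders so the session always sees the same absolute paths with fresh contents.

```python
import shutil
from pathlib import Path
from typing import Tuple


def stage_folders(source_a: Path, source_b: Path, staging_root: Path) -> Tuple[Path, Path]:
    """Copy both source folders to fixed CompareA/CompareB locations.

    The staged paths never change, so a session saved against them
    keeps matching no matter where the source folders live.
    """
    staged = []
    for name, source in (("CompareA", source_a), ("CompareB", source_b)):
        target = staging_root / name
        # Remove the previous copy so stale files cannot show up as orphans.
        if target.exists():
            shutil.rmtree(target)
        # copytree creates the target (and missing parents) and copies recursively.
        shutil.copytree(source, target)
        staged.append(target)
    return staged[0], staged[1]


if __name__ == "__main__":
    # Hypothetical locations; adjust to your own layout.
    left, right = stage_folders(
        Path(r"C:\temp-different\side-a"),
        Path(r"C:\temp-different\side-b"),
        Path(r"C:\bc-staging"),
    )
    print(left, right)
```

After staging, the automated task would start Beyond Compare against the session or script that was saved with the CompareA and CompareB paths.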