Monday, March 7, 2022

Database Comparison Tool

When moving data from one database to another, whether it is from one database vendor to another or just one server to another, in many cases you want to make sure that all the data was transferred, and nothing was lost in the process. The Data Platform Engineering Team has heard this requirement frequently from customers, so our team has created a tool that enables you to verify that the data on both source and target is exactly the same. With the appropriate drivers, the tool supports SQL Server, Oracle, DB2, Teradata, Netezza, PostgreSQL, and MySQL as sources and SQL Server as a target. Future versions may include support for Azure Database for PostgreSQL and MySQL targets. The target is used to read the schema, so only SQL Server can provide the table and column information.


To reduce the traffic on the wire from the source and the target databases, instead of transferring all the columns for each row, the tool creates a 16 byte MD5 hash of the column values for each row and compares the hash values, source to target, using the primary key. This therefore requires that the table has a primary key or the application will use a rowguid or even a user specified unique key, if necessary. The tool still causes a lot of network traffic and load on the server, so care must be taken as to when and where you run the tool.


The tool is a command line application with no GUI user interface. It is controlled using the app.config file and an optional Excel worksheet. The worksheet allows much finer grain selection of tables to process, including comparing tables with different names between source and target, which columns to ignore, even the option of filtering rows in a table and potentially processing tables in parallel. The tool displays errors and warnings as console output and creates log files for each compared table with mismatches. The tool can also optionally create a report Excel worksheet with a summary row for each table processed.


The tool can be downloaded here. There are additional details in the User’s Guide included in the download.  


If you have feedback or suggestions for improving this data migration asset, please contact the Data Platform Engineering Team. Thanks for your support!

Posted at