Data Reconciliation using Informatica Cloud

Data Reconciliation using Informatica Cloud

Data Reconciliation is the process of verify of the target data against the original source data.

Why Data Reconciliation is needed:-

  • During data migration, data might be lost during extraction from the source due to infrastructure or network issue 
  • Data might be poorly formatted during the transformation
  • Data might be duplicated during the source to the target load 
  • Some data might be rejected while loading into the target 
Some organizations rely on record counts and compare source and destination counts. This is better than doing nothing, but it does not solve the problem completely. If updates are not adequately captured or applied, the record counts may be the same but the data can be drastically different.

For overcoming the above scenarios we need to verify the value in each field that is needed for analysis. 

It will be time consuming to compare all the target fields with source fields when the target has hundreds of columns but the scenario can be easily designed using horizontal macro and MD5 Function in expression transformation.

Process:-

In this case, load happened between SQL Server Tabe to Snowflake Table, so the below mapping performing Data Reconciliation between SQL Server Table and Snowflake Table and loading the result in a Recon Table. WEB_SITE_SK is the primary key in the source 

Mapping Image
Mapping Image


1. Importing Source and Target Table:- 

Import the source or use the source query. better practice to use query as a source, so the columns are not required and can be eliminated on the source side. 
Make source and target columns propagate into downstream transformation in same order 


SQL Server Table Image


Snowflake Table Image


2.  Expression Macro and MD5 Value Generation 


Horizontal Macro function %OPR_CONCAT% has been used to concat all the columns and the MD5 Function generated the hash value on top of this result. 

Generating MD5 Value for SQL Server Image

Generating MD5 Value for Snowflake Image


3. Find the count of each WEB_SITE_SK in-target:-


Using aggregator transformation to aggregate the data on the basis of WEB_SITE_SK and calculating count
                                  
Aggregator Transformation Image

         

4. Joining two results:-


Joiner Transformation Image



5. Calculating the new fields 



Expression Image

Available_exp:- calculating where the WEB_SITE_SK presents


IS_MATCHED_exp:- verifying the generated Md5 value between source and target 
IS_DUPLICATED_exp:- Verifying if the data is duplicated in the target or not 






6. Loading into Recon Table:-






Final Output:-

Final Output Image



Comments