Stitching Connection Name Resolution

Connection names are the names a data integration (DI/ETL) or business intelligence (BI) tool uses to reference data stores. They are often not the same as the names of those same data stores as harvested in the repository. Because of this difference, the lineage overview of the model, which shows connection names, looks different from the data flow trace you see after stitching the connections to their data stores. Connection name resolution is performed automatically as part of the stitching process, and the data lineage trace view even presents the “proper” schema names (those from the data store harvest).

Example

Go to the object page for the AP to Staging ETL process.

 

 

There are two database connections. Note that their names are shortened versions of the data store names (spaces removed, etc.).
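To make the naming difference concrete, here is a minimal sketch of how a shortened connection name could be compared with a harvested data store name by ignoring spaces and letter case. It is an illustration only, not the product's stitching logic; the functions `normalize` and `match_connections` and all names other than StagingDW are hypothetical.

```python
def normalize(name: str) -> str:
    """Drop whitespace and ignore case so 'Staging DW' and 'StagingDW' compare equal."""
    return "".join(name.split()).casefold()


def match_connections(connection_names, data_store_names):
    """Map each connection name to the data store with the same normalized name, or None."""
    by_key = {normalize(store): store for store in data_store_names}
    return {conn: by_key.get(normalize(conn)) for conn in connection_names}


# Hypothetical names, chosen only to mirror the "spaces missing" pattern described above.
connections = ["StagingDW", "AccountsPayable"]
data_stores = ["Staging DW", "Accounts Payable"]
matches = match_connections(connections, data_stores)
```

A connection with no counterpart maps to `None`, which corresponds to a connection that has not been stitched to any data store.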

Click on the StagingDW connection:

 

 

<No name> is presented.

The schema name is not known, as it was never specified in the ETL design. This works because the database will simply use the default schema.

 

Now, go back up to the level of the entire ETL model and then click the Data Flow tab:

 

 

Again, you see the shortened names.

Because we went to the Data Flow tab with the entire ETL model open, rather than from a particular object in a database, we are presented with the Lineage Overview rather than a Lineage Trace. With the lineage overview, you see only what is in the ETL model, not the full end-to-end lineage trace.

However, since we have stitched this model to the two data stores, the complete (proper) names for the databases and schemas are available, and we see them in the lineage trace. Go back to the StagingDW connection, navigate to the Vendor table object page, and go to the Data Flow tab.

 

 

The proper names from the data store models are presented in the lineage trace because the lineage trace is not limited to only what is in the ETL model, unlike the lineage overview.
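The difference between the two views can be sketched as a small model: the overview labels a connection using only what the ETL model knows, while the trace substitutes the names from the stitched data store. This is a conceptual sketch under assumptions, not the tool's implementation; the classes, functions, and the schema name `dbo` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Connection:
    name: str
    schema: Optional[str] = None  # None when the ETL design never specified a schema


@dataclass
class DataStore:
    database: str
    schema: str


def overview_label(conn: Connection) -> str:
    """Lineage overview: only names known to the ETL model are available."""
    return f"{conn.name}.{conn.schema or '<No name>'}"


def trace_label(conn: Connection, stitched: Dict[str, DataStore]) -> str:
    """Lineage trace: prefer the names harvested from the stitched data store."""
    store = stitched.get(conn.name)
    if store is None:
        # Not stitched: nothing better than the connection's own names.
        return overview_label(conn)
    return f"{store.database}.{store.schema}"


conn = Connection("StagingDW")
# Hypothetical harvested names for the stitched data store.
stitched = {"StagingDW": DataStore(database="Staging DW", schema="dbo")}
```

With these assumed values, `overview_label(conn)` yields the shortened name with `<No name>` for the schema, while `trace_label(conn, stitched)` yields the names taken from the data store.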