To accomplish this, we're going to need to make use of elements of their HTML report, provided with the product, as well as making use of their application's API. But before getting into the weeds, let's take the 1000ft view and look at the data model example I'm going to use to help explain how all this works. This ER/Studio model represents the databases involved in supporting a data warehouse architecture. You can see we have our staging database, data warehouse and data mart:
Figure 1. Data Model Tab of ER/Studio |
On the Data Lineage tab we have a sample database that will be used as a source of data for the data warehouse databases:
Figure 2. Data Lineage Tab of ER/Studio |
In the source system database we have a table called person that were going to pull data from in this example. It has 4 columns which include a numeric identifier and some descriptions for a person:
Figure 3. Person Table from Source Database |
The first destination for the data in this table will be the Staging database in the stage.Person table. In order to associate these 2 tables for lineage purposes, we need to allow The Source database to be available for lineage to the Staging database. On the Data Model tab right click The Staging database in the tree view and click on Data Movement Properties. This will bring up a form that will allow you to pick available sources for the Staging database. Click on the add button and select the SourceSystem database:
Figure 4. Data Movement Sources/Targets for Staging |
This will make SourceSystem table columns available to be used as a source by tables in the Staging database. In Staging, let's double click on the stage.Person table, then double click the PersonID column. On the bottom of this form you should see a Data Lineage tab(You may need to scroll to the right to see it). On that tab click the edit button to expose the Edit Source Mappings form for the column. This is where we can choose the source(s) for the PersonID column in stage.Person:
Figure 5. Edit Source Mappings for stage.Person |
Here we check the PersonID column from the source database person table. You'll notice in the transformation logic text box I put the owner of the table. This is because in the objects offered by the ER/Studio API, the one for data lineage does not contain this attribute. Because we can have multiple tables with the same name in the same database, but with different schemas, we need to give the application a way to uniquely identify this source column. For the transformation description, since we are just doing a straight copy from source, I put "None" for transformations performed. Click OK. This will bring you back to the Data Lineage tab. You'll notice the information on the Direct Sources sub-tab will now be filled in. In addition to this tab you should also see tabs labeled "Direct Targets" and "Secondary Targets". These you can ignore. Embarcadero implicitly expects you to fill in children for the column. For this solution this is not necessary, we will make use of recursion to derive ancestors and children for a particular column. All we need to worry about is entering a column's direct ancestor(s).
This process can be repeated for the DataWarehouse, sourcing the Staging database for it's sources:
Figure 6. Person Data Vault Model |
Here we decompose the person table, using a data vault modeling approach, to create a person hub with associated satellites. For each of these tables we repeat the steps we did for the stage.Person table. Here is an example using the LastName column in the satellite.PersonLastName table:
Figure 7. Edit Source Mappings for satellite.PersonLastName |
The last stop for data will be a conformed dimension in our data mart for person:
Figure 8. Person Dimension in DataMart Database |
You'll notice a new column has appeared that was not there before, FullName. This is to demonstrate what happens when a column has more than 1 source. On the Edit Source Mapping for this column we have to do something slightly different to record lineage:
Figure 9. Edit Source Mappings for dims.DimPerson.FullName |
The first thing that should stand out is the Transformation Logic text box. Here we have a comma separated list of schemas for the tables selected in the order they are added as sources. We have 3 tables, thus we enter the schemas they belong to in order they appear:
Figure 10. dims.DimPerson Table Editor |
Now the hard part is over. We did our data entry and are ready to start getting into how to publish our data lineage. But first, we need to produce an HTML report for this ER/Studio model, because our data lineage reporting solution needs elements produced by this. By right clicking on any of the databases on the Data Model tab we can get a menu that will allow us to produce reports:
Figure 11. Database Menu |
This will bring up a wizard that will guide you through generating reports for your model, make sure to select HTML report for your report type:
Figure 12. ER/Studio Data Architect Report Wizard |
When you complete the wizard and the reports are generated you should see this in your target directory:
Figure 13. HTML Report Root Folder |
In the C:\DataWarehouseReport directory, create a new subdirectory called DataLineage. This will be our home for our data lineage HTML report. In this directory create 3 files index.htm, top.htm and tree.htm with the following code:
index.htm:
<frameset
rows="10%,90%">
<frame
src="top.htm" name="top">
<frame
src="tree.htm" name="bottom">
</frameset>
top.htm:
<html>
<body bgcolor="#ffffff" topmargin=0 leftmargin=0
link="#0000cc" vlink="#006666">
<table
width="744" cellpadding=1 cellspacing=1 bgcolor="#ffffff" border=0>
<tr
valign="top">
<td
width="1" align=left valign=center bgcolor="#ffffff"></td>
<td
width="80" align=left valign=center bgcolor="#ffffff">
<a
href="http://www.embarcadero.com/" target="_top"><img
src="../Support/Gif/Embarcadero.gif" alt="Link"
border=0></a>
</td>
<td
bgcolor="#ffffff" width="600">
<table
width="650" bgcolor="#ffffff" border="0">
<tr
bgcolor="31639c">
<td
width="20" bgcolor="#ffffff"> </td>
<TD
ALIGN = LEFT WIDTH="600" bgcolor="#ffffff">
<B><FONT
SIZE=5 FACE="Arial" color="darkblue">ER/Studio Data
Architect HTML Data Lineage Report</FONT></B></TD>
</tr>
</table>
</td>
</tr>
</table>
</body>
</html>
<html>
<head>
<style>
BODY {background-color: white}
TD {font-size: 9pt;font-family:
Arial;text-decoration: none;white-space: nowrap;}
A {text-decoration: none;color: black}
</style>
<script
src="../Support/ua.js"></script>
<script
src="../Support/ftiens4.js"></script>
<script
src="TreeNodes.js"></script>
</head>
<body
topmargin=16 marginheight=16>
<script>initializeDocument()</script>
</html>
Finally, we build the program that will pull the metadata out of ER/Studio Data Architect and create our data lineage report. For this I created a C# program using the .net 3.5 library since the dll provided by Embarcadero for COM will not work in any higher .net version. I built a C# console application for this, but it can just as easily be refactored for a form based project as well. In your project we need to make sure we add the dll so we have access to the COM objects necessary to pull this off. In visual studio go to Project-->Add Reference-->COM. Make sure ER/Studio Type Library is there:
Figure 14. Reference Manager |
If not, THIS WILL NOT WORK. Embarcadero tech support can help you get this installed if it wasn't installed with ER/Studio. This program requires 2 arguments, The file path to the dm1 file for your ER/Studio diagram and the file path for where you want to push the data lineage HTML report to:
Figure 15. Command Line Arguments |
Here is the code for the program:
using System;
using System.Collections.Generic;
using System.Text;
using ERStudio;
using System.IO;
using System.Linq;
using
System.Text.RegularExpressions;
namespace DataLineage
{
class DataLineage
{
//Erstudio application object
private ERStudio.Application _app;
//Erstudio diagram object
private ERStudio.Diagram _diag;
//The file path to the erstudio diagram
(local path)
private string _outPutFilePath;
//Class to hold attributes of column children
private class Child
{
public int key;
public int child;
public string childName;
public string transformDesc;
}
//Collection that contains column
ancestors
private Dictionary<int, ERStudio.AttributeObj> _Ancestors = new Dictionary<int, AttributeObj>();
//Collection that contain column children
private Lookup<int, Child> _Children;
//Initial List of children needed to load
lookup
private List<Child> _Child = new List<Child>();
static void Main(string[] args)
{
//The file path to the erstudio diagram
(local path)
string diagramPath = args[0];
//The file path to the .js file for
output
string filePath = args[1];
//Pass arguments
DataLineage my_app = new DataLineage(diagramPath, filePath);
}
/// <summary>
/// This handles the main program control
flow
/// </summary>
/// <param
name="diagramPath">The file path to the erstudio diagram (local path)</param>
/// <param name="filePath">he file path to the .js file
for output</param>
public DataLineage(string diagramPath, string filePath)
{
this._app = OpenApp();
this._diag = OpenDiagram(diagramPath);
this._outPutFilePath = filePath;
//Loads erstudio objects into memory
LoadERStudioObjects();
//Prints out the .js file
PrintJavascriptNodes();
//Closes ErStudio
CloseApp();
}
/// <summary>
/// Loads columns from the ERStudio model
into C# dictionary
/// </summary>
private void LoadERStudioObjects()
{
Console.WriteLine("Loading ERStudio
objects into memory");
//Loop through models
foreach (ERStudio.Model md in _diag.Models())
{
if (md.Name.ToString() != "Logical")
{
Console.WriteLine("Loading " + md.Name + " tables");
//Loop through tables
foreach (ERStudio.Entity ent in md.Entities())
{
Console.WriteLine("Loading " + md.Name + "." + ent.TableName + " attributes ");
//Load the columns into a
collection if they have lineage
foreach (ERStudio.AttributeObj attr in ent.Attributes())
{
//Load Ancestors
if(Convert.ToInt32(attr.DataMovementColumnLinks().Count)
> 0)
{
_Ancestors.Add((md.Name + "." + ent.Owner +"." + ent.TableName + "." +
attr.ColumnName).ToLower().GetHashCode(), attr);
int i = 0;
string[] transformationLogic = Regex.Replace(attr.SourceDirectTransformationLogic.ToString(),
@"\r\n?|\n", " ").Split(',');
string transformationDesc = Regex.Replace(attr.SourceDirectTransformationDescription.ToString(),
@"\r\n?|\n", " ").Replace(@"""", @"\""").Trim();
//Load Children
foreach (DataMovementColumnLink dcl in
attr.DataMovementColumnLinks())
{
Child cd = new Child();
string owner = "";
//Protection from wrong
schema or no schema entry or no source column entry
if (transformationLogic.Count()
>= (1) && dcl.SrcTrgtColumnName !=null &&
dcl.SrcTrgtColumnName.ToString()!="")
{
try
{
owner = transformationLogic[i].Trim();
}
catch(Exception e)
{
Console.WriteLine("cant write
ancestor");
}
}
//Ancestor hash
cd.key =
(dcl.SrcTrgtModelName + "." + owner + "." + dcl.SrcTrgtTableName + "." +
dcl.SrcTrgtColumnName).ToLower().GetHashCode();
//Child hash
cd.child =
(dcl.PhysicalModelName + "." + ent.Owner + "." + dcl.PhysicalTableName + "." +
dcl.PhysicalColumnName).ToLower().GetHashCode();
//Child Name
cd.childName =
dcl.PhysicalModelName + "."+ ent.Owner + "." + dcl.PhysicalTableName + "." + dcl.PhysicalColumnName;
//Child transformation
description
cd.transformDesc
= transformationDesc;
_Child.Add(cd);
i++;
}
}
}
}
}
}
//Load children into a lookup for faster
searches
_Children = (Lookup<int, Child>)_Child.ToLookup(c =>
c.key);
}
/// <summary>
/// This method prints out the javascript
nodes
/// </summary>
private void PrintJavascriptNodes()
{
//Create/overwrite existing file
if (File.Exists(_outPutFilePath))
{
File.Delete(_outPutFilePath);
}
//Write javascript to file
using (System.IO.StreamWriter file = new System.IO.StreamWriter(_outPutFilePath, true))
{
file.WriteLine(" <!-- This
frameset document includes the FolderTree script. Script found in: http://www.treeview.net
Author: Marcelino Alves Martins -->");
file.WriteLine(" ICONPATH =
'../Support/'");
file.WriteLine(" USEICONS = 1");
file.WriteLine("{");
file.WriteLine("foldersTree =
gFld('', '')");
file.WriteLine("foldersTree.iconSrc =
ICONPATH + 'Gif/globe.gif'");
file.WriteLine("Diag_Node =
insFld(foldersTree, gFld('Data Lineage', ''))");
file.WriteLine("Diag_Node.iconSrc =
ICONPATH + 'Gif/data_lineage.gif'");
file.WriteLine("Diag_Node.iconSrcClosed
= ICONPATH + 'Gif/data_lineage.gif'");
//Add database nodes to file
PrintDatabaseNode(file);
//Add data source nodes to
file
PrintOtherSources(file);
file.WriteLine("}");
}
}
/// <summary>
/// Prints out to specified file the
javascript nodes for databases
/// </summary>
/// <param name="file">The file to print to</param>
private void PrintDatabaseNode(System.IO.StreamWriter file)
{
foreach (ERStudio.Model md in _diag.Models())
{
if (md.Name.ToString() != "Logical")
{
Console.WriteLine("Writing " + md.Name + " node");
file.WriteLine("Model_Node_" + md.GUID.ToString() + " = insFld(Diag_Node,
gFld(" + "\"" + md.Name + "\"" + ",
\"\"))");
if (md.Dimensional.ToString()
== "True")
{
file.WriteLine("Model_Node_" + md.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/dimensionalmodel.gif\"");
file.WriteLine("Model_Node_" + md.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/dimensionalmodel.gif\"");
}
else
{
file.WriteLine("Model_Node_" + md.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/physical.gif\"");
file.WriteLine("Model_Node_" + md.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/physical.gif\"");
}
PrintTableNode(file, md);
}
}
}
/// <summary>
/// Prints out javascript nodes for tables
for a given database
/// </summary>
/// <param name="file">The file to print to</param>
/// <param name="md">The erstudio database model</param>
private void PrintTableNode(System.IO.StreamWriter file, ERStudio.Model md)
{
Console.WriteLine("Writing table
javascript nodes for " + md.Name);
//Pull all tables that exist in Model md
List<ERStudio.Entity> results = new List<ERStudio.Entity>();
foreach (ERStudio.Entity e in md.Entities())
{
results.Add(e);
}
//Order the list by schema then by table
name
IOrderedEnumerable<ERStudio.Entity> orderedResults =
results.OrderBy(s => s.Owner).ThenBy(s => s.TableName);
foreach (ERStudio.Entity t in orderedResults)
{
file.WriteLine("Table_Node_" + t.GUID.ToString() + " =
insFld(Model_Node_" + md.GUID.ToString() + ", gFld(" + "\"" + t.Owner + "." + t.TableName + "\"" + ",
\"\"))");
switch
(t.DimModelTableType.ToString())
{
case "DIMENSION":
file.WriteLine("Table_Node_" + t.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/dimension.gif\"");
file.WriteLine("Table_Node_" + t.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/dimension.gif\"");
break;
case "FACT":
file.WriteLine("Table_Node_" + t.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/fact.gif\"");
file.WriteLine("Table_Node_" + t.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/fact.gif\"");
break;
case "SNOWFLAKE":
file.WriteLine("Table_Node_" + t.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/snowflake.gif\"");
file.WriteLine("Table_Node_" + t.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/snowflake.gif\"");
break;
case "BRIDGE":
file.WriteLine("Table_Node_" + t.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/bridge.gif\"");
file.WriteLine("Table_Node_" + t.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/bridge.gif\"");
break;
case "NAVIGATION":
file.WriteLine("Table_Node_" + t.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/navigation.gif\"");
file.WriteLine("Table_Node_" + t.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/navigation.gif\"");
break;
case "UNDEFINED":
file.WriteLine("Table_Node_" + t.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/undefined.gif\"");
file.WriteLine("Table_Node_" + t.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/undefined.gif\"");
break;
default:
file.WriteLine("Table_Node_" + t.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/entity.gif\"");
file.WriteLine("Table_Node_" + t.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/entity.gif\"");
break;
}
PrintColumnNode(file, t,
md.Name.ToString());
}
}
/// <summary>
/// Prints out the javascript nodes for
other sources
/// </summary>
/// <param name="file">The file to print to</param>
private void PrintOtherSources(System.IO.StreamWriter file)
{
file.WriteLine("Other_Folder_Node =
insFld(Diag_Node, gFld(" +"\"Other Sources\"" + ",
\"\"))");
file.WriteLine("Other_Folder_Node.iconSrc
= ICONPATH + \"Gif/other.gif\"");
file.WriteLine("Other_Folder_Node.iconSrcClosed
= ICONPATH + \"Gif/other.gif\"");
//Loop through models
foreach (ERStudio.DataLineageSource ds in _diag.DataLineageSources())
{
Console.WriteLine("Writing " + ds.Name + " node");
file.WriteLine("Other_Node_" + ds.GUID.ToString() + " =
insFld(Other_Folder_Node, gFld(" + "\"" + ds.Name + "\"" + ",
\"\"))");
if (ds.Type.ToString() == "Flat File")
{
file.WriteLine("Other_Node_" + ds.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/flat_file.gif\"");
file.WriteLine("Other_Node_" + ds.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/flat_file.gif\"");
}
else
{
file.WriteLine("Other_Node_" + ds.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/relational.gif\"");
file.WriteLine("Other_Node_" + ds.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/relational.gif\"");
}
PrintOtherSourceTableNode(file,
ds);
}
}
/// <summary>
/// Prints out javascript nodes for tables
for a given database
/// </summary>
/// <param name="file">The file to print to</param>
/// <param name="ds">The erstudio database model</param>
private void
PrintOtherSourceTableNode(System.IO.StreamWriter file, ERStudio.DataLineageSource ds)
{
Console.WriteLine("Writing table
javascript nodes for " + ds.Name);
//Pull all tables that exist in
DataLineageSource ds
List<ERStudio.DataLineageTable> results = new List<ERStudio.DataLineageTable>();
foreach (ERStudio.DataLineageTable e in ds.DataLineageTables())
{
results.Add(e);
}
//Order the list by schema then by table
name
IOrderedEnumerable<ERStudio.DataLineageTable> orderedResults =
results.OrderBy(s => s.Owner).ThenBy(s => s.Name);
foreach (ERStudio.DataLineageTable t in orderedResults)
{
file.WriteLine("OtherTable_Node_" + t.GUID.ToString() + " =
insFld(Other_Node_" + ds.GUID.ToString() + ", gFld(" + "\"" + t.Owner + "." + t.Name + "\"" + ",
\"\"))");
file.WriteLine("OtherTable_Node_" + t.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/entity.gif\"");
file.WriteLine("OtherTable_Node_" + t.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/entity.gif\"");
PrintOtherColumnNode(file, t,
ds.Name.ToString());
}
}
/// <summary>
/// Prints out javascript nodes for columns
for a given table
/// </summary>
/// <param name="file">The file to print to</param>
/// <param name="t">The erstudio table</param>
/// <param name="modelName">The name of the database for
the table</param>
private void PrintColumnNode(System.IO.StreamWriter file, ERStudio.Entity t, string modelName)
{
Console.WriteLine("Writing " + t.TableName.ToString() + " column javascript
nodes");
//Pull all columns that exist in Table t
List<ERStudio.AttributeObj> results = new List<ERStudio.AttributeObj>();
foreach (ERStudio.AttributeObj a in t.Attributes())
{
results.Add(a);
}
//Order the list by column name
IOrderedEnumerable<AttributeObj> orderedResults =
results.OrderBy(s => s.ColumnName);
foreach (AttributeObj c in orderedResults)
{
int columnHash = (modelName + "." + c.GetParent().Owner + "."+ c.GetParent().TableName + "." +
c.ColumnName).ToLower().GetHashCode();
file.WriteLine("Column_Node_" + c.GUID.ToString() + " =
insFld(Table_Node_" + t.GUID.ToString() + ", gFld(" + "\"" + c.ColumnName +"\"" + ",
\"\"))");
if (c.PrimaryKey.ToString() == "True")
{
if (c.ForeignKey.ToString() == "False")
{
file.WriteLine("Column_Node_" + c.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/pkattr.gif\"");
file.WriteLine("Column_Node_" + c.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/pkattr.gif\"");
}
else
{
file.WriteLine("Column_Node_" + c.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/pkfkattr.gif\"");
file.WriteLine("Column_Node_" + c.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/pkfkattr.gif\"");
}
}
else if (c.ForeignKey.ToString() == "True")
{
file.WriteLine("Column_Node_" + c.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/foreign.gif\"");
file.WriteLine("Column_Node_" + c.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/foreign.gif\"");
}
else
{
file.WriteLine("Column_Node_" + c.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/attr.gif\"");
file.WriteLine("Column_Node_" + c.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/attr.gif\"");
}
file.WriteLine("Ancestor_Folder_Node_" + c.GUID.ToString() + " = insFld(Column_Node_" + c.GUID.ToString() + ", gFld(" + "\"Ancestors\"" + ",
\"\"))");
file.WriteLine("Ancestor_Folder_Node_" + c.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/bluefldr.gif\"");
file.WriteLine("Ancestor_Folder_Node_" + c.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/blueclsd.gif\"");
//Print ancestors to
javascript file
PrintAncestors(file, c, "Ancestor_Folder_Node_" + c.GUID.ToString());
file.WriteLine("Child_Folder_Node_" + c.GUID.ToString() + " =
insFld(Column_Node_" + c.GUID.ToString() + ", gFld(" + "\"Children\"" + ",
\"\"))");
file.WriteLine("Child_Folder_Node_" + c.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/bluefldr.gif\"");
file.WriteLine("Child_Folder_Node_" + c.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/blueclsd.gif\"");
Console.WriteLine("Searching for
children for " + c.ColumnName.ToString());
//Print children to
javascript file
PrintChildren(file, columnHash,
"Child_Folder_Node_" + c.GUID.ToString());
}
}
/// <summary>
/// Prints out javascript nodes for columns
for a given other source table
/// </summary>
/// <param name="file">The file to print to</param>
/// <param name="t">The other source table</param>
/// <param name="modelName">The name of the database for
the table</param>
private void PrintOtherColumnNode(System.IO.StreamWriter file, ERStudio.DataLineageTable t, string modelName)
{
Console.WriteLine("Writing " + t.Name.ToString() + " column javascript
nodes");
//Pull all columns that exist in Table t
List<ERStudio.DataLineageColumn> results = new List<ERStudio.DataLineageColumn>();
foreach (ERStudio.DataLineageColumn a in t.DataLineageColumns())
{
results.Add(a);
}
//Order the list by column name
IOrderedEnumerable<DataLineageColumn> orderedResults =
results.OrderBy(s => s.ColumnName);
foreach (DataLineageColumn c in orderedResults)
{
int columnHash = (modelName + "." + t.Owner + "." + t.Name +"." +
c.ColumnName).ToLower().GetHashCode();
file.WriteLine("OtherColumn_Node_" + c.GUID.ToString() + " =
insFld(OtherTable_Node_" + t.GUID.ToString() + ", gFld(" + "\"" + c.ColumnName + "\"" + ",
\"\"))");
if (c.PrimaryKey.ToString() == "True")
{
file.WriteLine("OtherColumn_Node_" + c.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/pkattr.gif\"");
file.WriteLine("OtherColumn_Node_" + c.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/pkattr.gif\"");
}
else
{
file.WriteLine("OtherColumn_Node_" + c.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/attr.gif\"");
file.WriteLine("OtherColumn_Node_" + c.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/attr.gif\"");
}
file.WriteLine("OtherChild_Folder_Node_" + c.GUID.ToString() +" =
insFld(OtherColumn_Node_" + c.GUID.ToString() + ", gFld(" + "\"Children\""+ ",
\"\"))");
file.WriteLine("OtherChild_Folder_Node_" + c.GUID.ToString() +".iconSrc = ICONPATH +
\"Gif/bluefldr.gif\"");
file.WriteLine("OtherChild_Folder_Node_" + c.GUID.ToString() +".iconSrcClosed =
ICONPATH + \"Gif/blueclsd.gif\"");
Console.WriteLine("Searching for
children for " + c.ColumnName.ToString());
//Print children to
javascript file
PrintChildren(file, columnHash,
"OtherChild_Folder_Node_" + c.GUID.ToString());
}
}
/// <summary>
/// Returns a dictionary of ancestor id's
for passed collection
/// </summary>
/// <param name="links">Collection of ancestors</param>
/// <returns>Dictionary of ancestor ids</returns>
private Dictionary<int, object> GenerateAncestorList(DataMovementColumnLinks links)
{
Dictionary<int, object> ancestorMarker = new Dictionary<int,object>();
int j = 0;
foreach (DataMovementColumnLink ancestor in links)
{
ancestorMarker.Add(j,
ancestor.GUID);
j++;
}
return ancestorMarker;
}
/// <summary>
/// Prints out ancestors for a given column
/// </summary>
/// <param name="file">The file to print to</param>
/// <param name="c">The erstudio column</param>
/// <param
name="ancestorNodeGUID">The identifier for the parent ancestor
node so we can nest</param>
private void PrintAncestors(System.IO.StreamWriter file, ERStudio.AttributeObj c, string ancestorNodeGUID)
{
string ancestorName;
Dictionary<int, object> ancestorMarker =
GenerateAncestorList(c.DataMovementColumnLinks());
//If there are source column(s)
documented
if (Convert.ToInt32(c.DataMovementColumnLinks().Count)
> 0)
{
Console.WriteLine("Searching for
ancestors for " + c.ColumnName.ToString());
string[] transformationLogic = Regex.Replace(c.SourceDirectTransformationLogic.ToString(),
@"\r\n?|\n", " ").Split(',');
string transformationDesc = Regex.Replace(c.SourceDirectTransformationDescription.ToString(),
@"\r\n?|\n", " ").Replace(@"""", @"\""").Trim();
int i = 0;
//Write transformation
description
file.WriteLine("Transform_Ancestor_Node_" + c.GUID.ToString() + " = insFld(" + ancestorNodeGUID + ", gFld(" + "\"Transformation:
" +
transformationDesc + "\"" + ", \"\"))");
file.WriteLine("Transform_Ancestor_Node_" + c.GUID.ToString() + ".iconSrc = ICONPATH +
\"Gif/Transformation.gif\"");
file.WriteLine("Transform_Ancestor_Node_" + c.GUID.ToString() + ".iconSrcClosed =
ICONPATH + \"Gif/Transformation.gif\"");
//Loop through ancestry
while (i < c.DataMovementColumnLinks().Count)
{
//Return object based on id
from dictionary
DataMovementColumnLink ancestor =
c.DataMovementColumnLinks().get_Item(ancestorMarker[i]);
//Find current ancestor
column object in list
ERStudio.AttributeObj result;
string owner = "";
//Protection from wrong
schema or no schema entry
if (transformationLogic.Count()
>= (i + 1))
{
owner =
transformationLogic[i].Trim();
}
ancestorName =
ancestor.SrcTrgtModelName.ToString() + "."+ owner + "." +
ancestor.SrcTrgtTableName.ToString() + "." + ancestor.SrcTrgtColumnName.ToString();
Console.WriteLine("Found " + ancestorName);
file.WriteLine("Ancestor_Node_" + c.GUID.ToString() +
i.ToString() + " = insFld(" + ancestorNodeGUID + ", gFld(" + "\"Ancestor
Column: " + ancestorName + "\"" + ",
\"\"))");
file.WriteLine("Ancestor_Node_" + c.GUID.ToString() +
i.ToString() + ".iconSrc = ICONPATH +
\"Gif/TransSelectInto.gif\"");
file.WriteLine("Ancestor_Node_" + c.GUID.ToString() +
i.ToString() + ".iconSrcClosed = ICONPATH +
\"Gif/TransSelectInto.gif\"");
if
(_Ancestors.TryGetValue(ancestorName.ToLower().GetHashCode(), out result))
{
//If the ancestor of ancestor
also has an ancestor
if(Convert.ToInt32(result.DataMovementColumnLinks().Count)
> 0)
{
//Recursively find ancestor
of current ancestor
PrintAncestors(file, result, "Ancestor_Node_" + c.GUID.ToString() +
(i).ToString());
}
}
i++;
}
}
}
/// <summary>
/// Prints out children for a given column
/// </summary>
/// <param name="file">The file to print to</param>
/// <param name="columnHash">The hashvalue of the column
name</param>
/// <param
name="childNodeGUID">The identifier for the parent child node
so we can nest</param>
private void PrintChildren(System.IO.StreamWriter file, int columnHash, string childNodeGUID)
{
int i = 0;
//If there are children column(s)
documented
foreach (Child cd in _Children[columnHash])
{
//Write transformation
description
file.WriteLine("Transform_Child_Node_" + ((columnHash.ToString()) +
(i.ToString())).Replace("-", "z") + " = insFld(" + childNodeGUID + ", gFld(" + "\"Transformation:
" +
cd.transformDesc + "\"" +", \"\"))");
file.WriteLine("Transform_Child_Node_" + ((columnHash.ToString()) +
(i.ToString())).Replace("-", "z") + ".iconSrc = ICONPATH +
\"Gif/Transformation.gif\"");
file.WriteLine("Transform_Child_Node_" + ((columnHash.ToString()) +
(i.ToString())).Replace("-", "z") + ".iconSrcClosed =
ICONPATH + \"Gif/Transformation.gif\"");
file.WriteLine("Child_Node_" + ((columnHash.ToString()) +
(i.ToString())).Replace("-", "z") + " = insFld(" + childNodeGUID + ", gFld(" +"\"Child Column:
" +
cd.childName + "\"" + ", \"\"))");
file.WriteLine("Child_Node_" + ((columnHash.ToString()) +
(i.ToString())).Replace("-", "z") + ".iconSrc = ICONPATH +
\"Gif/TransGroupBy.gif\"");
file.WriteLine("Child_Node_" + ((columnHash.ToString()) +
(i.ToString())).Replace("-", "z") + ".iconSrcClosed =
ICONPATH + \"Gif/TransGroupBy.gif\"");
//Recursively find children
of current child
PrintChildren(file, cd.child, "Child_Node_" + ((columnHash.ToString()) +
(i.ToString())).Replace("-", "z"));
i++;
}
}
/// <summary>
/// Closes the Erstudio application
/// </summary>
private void CloseApp()
{
//this.app.Quit();
_app.CloseDiagram(_diag.FileName);
_app.Quit();
}
/// <summary>
/// Opens the Erstudio application
/// </summary>
/// <returns>An instance of the Erstudio
application</returns>
private ERStudio.Application OpenApp()
{
ERStudio.Application app = new ERStudio.Application();
//app.HideWindow();
return app;
}
/// <summary>
/// Opens a specific Erstudio diagram
/// </summary>
/// <param name="FileName">The file path to the diagram</param>
/// <returns>An Erstudio diagram object</returns>
private ERStudio.Diagram OpenDiagram(string FileName)
{
ERStudio.Diagram diag =
_app.OpenFile(FileName);
return diag;
}
/// <summary>
/// Opens an Erstudio model
/// </summary>
/// <param name="ModelName">The name of the Model</param>
/// <returns>An Erstudio model object</returns>
private ERStudio.Model OpenModel(string ModelName)
{
ERStudio.Model mod =
_diag.Models().Item[ModelName];
return mod;
}
}
}
Figure 16. DataWarehouse Data Lineage |
Figure 17. Foreign Key Lineage |
You can also derive lineage from foreign keys as well and see the same lineage for that same column in the table where it is a primary key.
Figure 18. Data Mart Data Lineage |
In this screenshot you can see the multiple ancestors for the dims.Person.FullName column, showing that you can access more than 1 ancestor in your report if present:
Figure 19. Staging Data Lineage |
This screenshot shows the data lineage for Staging database, where it has 1 ancestor from the source system and a whole lot of children.
Figure 20. Data Source Data Lineage |
The one thing I would love Embarcadero to add to their API would be the SrcTrgtTableOwnerName to the DataMovementColumnLinkClass:
Figure 21. DataMovementColumnLinkClass |
That would free up the Transformation Logic text box and the owner could be handled in code behind the scenes. Petition Embarcadero...they may do it, if enough people ask for it.