Data Documentation and Governance with Data Models

Today, data models, and the formal data management and engineering practices that produce them, are not maintained as commonly as they once were. These models were often used to forward engineer databases as part of a formal database design process. With the advent of big data, data lakes, and self-service BI, where the emphasis is on flexibility and rapid report development and turnaround, the venerable data management models are seen as less useful, and even as “getting in the way” of today's faster-paced world.

Nevertheless, these data models often contain a wealth of data documentation. Many organizations eye them not for their forward engineering value but as a source of table and column definitions, domains, validation rules, enumerations, and so on. While these models often do hold good, even well-curated, definitions and related information, data models were never ideal for building reusable terminology or standardized definitions. Instead, the definition of a given column is repeated (re-entered rather than reused or shared) in each table in which that column appears.

Thus, it is generally not a good idea to simply import data models and use them as the source of standard definitions and business names. Instead, it behooves the organization to migrate (and standardize and normalize) the wealth of information these models contain into a more future-proof and flexible solution such as a glossary, while retaining what is uniquely valuable in the models themselves, such as ER diagrams.
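
To make the migration step concrete, here is a minimal sketch in Python of how per-table column definitions might be consolidated into glossary terms. Everything in it is an assumption made for illustration: the table and column names, the definition text, the normalize and build_glossary helpers, and the rule that two definitions count as the same when they differ only in case, whitespace, or a trailing period. A real migration would read from the modeling tool's export and would likely need a more forgiving comparison.

```python
from collections import defaultdict

# Hypothetical export of column definitions from a data model:
# (table, column, definition). Names and text are illustrative only.
MODEL_COLUMNS = [
    ("customer", "customer_id", "Unique identifier for a customer."),
    ("order", "customer_id", "Unique identifier for a customer."),
    ("invoice", "customer_id", "unique identifier for a  customer"),
    ("order", "order_status", "Current state of the order."),
    ("shipment", "order_status", "Status of the shipment's parent order."),
]


def normalize(text):
    """Lowercase, collapse whitespace, and drop trailing periods
    so that trivial variants of a definition compare equal."""
    return " ".join(text.lower().split()).rstrip(".")


def build_glossary(columns):
    """Group definitions by column name and return (glossary, conflicts)."""
    variants = defaultdict(lambda: defaultdict(list))
    for table, column, definition in columns:
        variants[column][normalize(definition)].append((table, definition))

    glossary, conflicts = {}, {}
    for column, by_norm in variants.items():
        tables = sorted(t for uses in by_norm.values() for t, _ in uses)
        if len(by_norm) == 1:
            # One distinct definition: store it once, keeping the first wording seen.
            (uses,) = by_norm.values()
            glossary[column] = {"definition": uses[0][1], "used_in": tables}
        else:
            # Several distinct definitions: leave these for a data steward to reconcile.
            conflicts[column] = {
                "candidates": [uses[0][1] for uses in by_norm.values()],
                "used_in": tables,
            }
    return glossary, conflicts


glossary, conflicts = build_glossary(MODEL_COLUMNS)
```

In this sketch, customer_id becomes a single glossary term linked to the three tables that use it, while order_status is set aside as a conflict because the model carries two different definitions under the same column name. Surfacing such conflicts for review, rather than picking one silently, is the standardization work that a straight import of the model would skip.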