o NEW FEATURE OVERVIEW
This update focuses on new business user features such as user collaboration, data
documentation, business information diagramming, sharing, ownership, and faceted
search. These features involve new Articles and Diagrams, as well as improvements to
existing capabilities (Collections, Worksheets, Dashboards, Presentations).
These new features are critical for data shopping, data trust, and data health
applications.
In addition, new usage analytics capabilities have been added
to examine user growth, user search popularity, object inventory growth,
glossary growth, documentation coverage, data classification growth, data
lineage coverage, user collaboration growth, etc.
o NEW FACETED SEARCH AND WORKSHEETS FOR BUSINESS USERS
The search UI has been fully redesigned for a
better business user experience, including accelerated faceted search filters
fully adjustable for different personas. The filters and faceted search
preferences can be saved by the user and exported to a Worksheet for advanced
analysis.
There are numerous other search improvements with additional
adjustable search criteria, all aiming at returning the best results first. For
example, naming standards (which can differ per persona) are used to find the right
tables when a business user searches by full business name (it also works the
other way around when searching by acronym). In addition, there is a
search details panel on the right. When this panel is shown, you may click on
an object to view its properties in the panel. The panel is presented much like
the Overview tab in the object page or details page and may be customized when
the Overview tab is customized.
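The naming standard lookup described above can be illustrated with a minimal sketch. The abbreviation table, the function names, and the matching rule are all hypothetical and only illustrate the idea of matching a business name against physical names (and the other way around); they do not describe the actual search implementation.

```python
# Hypothetical naming standard: abbreviation -> business word (illustration only).
NAMING_STANDARD = {"CUST": "Customer", "ADDR": "Address", "ACCT": "Account"}


def expand_physical_name(name, standard=NAMING_STANDARD):
    """Turn a physical name such as CUST_ADDR into 'Customer Address'."""
    parts = name.upper().split("_")
    return " ".join(standard.get(part, part.capitalize()) for part in parts)


def search(query, physical_names, standard=NAMING_STANDARD):
    """Return physical names that match the query directly or via the expanded business name."""
    wanted = query.strip().lower()
    hits = []
    for name in physical_names:
        business_name = expand_physical_name(name, standard).lower()
        if wanted in name.lower() or wanted in business_name:
            hits.append(name)
    return hits


tables = ["CUST_ADDR", "ORDER_HDR"]
print(search("customer address", tables))  # search by full business name
print(search("cust", tables))              # search by acronym
```

A per-persona standard could be modeled by passing a different table as the `standard` argument.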
Worksheets provide metadata reporting capabilities where both search and browse (as well as a direct link in the OBJECTS menu) lead to a common worksheet page:
- One may start from search and then migrate to a worksheet, which allows for simple text filtering (e.g. customer) as a basis for a worksheet
- One may easily start from a category (e.g. database / tables)
- One may save and share worksheets so that other users may quickly reproduce and build on earlier queries/filtering/column selections
- Direct access to a default worksheet does not predefine anything.
See
- New faceted search worksheet layout
- Faceted search filter editing examples
o IMPROVED SEARCH PERFORMANCE
Search is now implemented by a dedicated Solr server rather than local Lucene
index files managed by the MM application server. As a side effect, the overall
performance of the MM server and search has been significantly improved. For
example, a Worksheet may have a million rows that will be indexed and sorted.
See Solr Indexing Server.
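For readers unfamiliar with how a Solr server serves faceted search, the sketch below builds a query string using Solr's standard request parameters (q, fq, rows, facet, facet.field). The field names (objectType, and so on) are hypothetical and do not describe the MM index schema, which is not documented here.

```python
from urllib.parse import urlencode


def build_facet_query(text, facet_fields, filters=None, rows=20):
    """Build the query string of a faceted Solr select request."""
    params = [("q", text), ("rows", str(rows)), ("facet", "true")]
    # One facet.field parameter per facet to be counted.
    params += [("facet.field", field) for field in facet_fields]
    # Each selected facet value becomes a filter query (fq).
    for field, value in (filters or {}).items():
        params.append(("fq", f'{field}:"{value}"'))
    return urlencode(params)


# Hypothetical field names, for illustration only.
print(build_facet_query("customer", ["objectType", "steward"], {"objectType": "Table"}))
```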
o IMPROVED USER EXPERIENCE
The UI has been significantly improved for better clarity and user
experience in many areas, in particular:
- Selecting users (or groups) for different use cases (such as filter per user/group, or worksheet sharing to users) has been harmonized and improved for usability (to search) and scalability to a very large number of users.
- Managing Users now offers a paginated UI with filters allowing for much improved scalability.
- User Activity log's UI layout has been redesigned with a new look and feel.
- Object History / change log's UI layout has been redesigned with a new look and feel.
o NEW OBJECT ATTRIBUTE GROUPING STRUCTURE DATA TYPE
This new feature allows for the grouping of object attributes of the same domain/purpose into a
dedicated structure data type. That is the case for the existing data
profiling attributes (e.g. count, distinct, min value, etc.) or any future
operational metadata attributes (e.g. start, end of DI/ETL job executions).
This new grouping structure is not only available for predefined attributes,
but also for custom attributes (Manage > Metamodel) where an
administrator can create an Address structure with Street, City, ZIP.
This new grouping structure can be used in the MQL language, for example "Data Profiling".count and Address.City in Worksheets.
This new feature significantly improves the user experience and readability of a long flat list of object
attributes by offering separate widgets for each grouping structure. More
importantly, this grouping can be used in the UI customization (dashboard) as an
optional widget for just data profiling, operational metadata, etc.
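As an illustration of the grouping idea, the sketch below stores grouped attributes as nested structures and resolves dotted paths such as "Data Profiling".count or Address.City. The path parser and the sample object are assumptions made for this example; they are not the MQL parser or the repository data model.

```python
import re

# A path segment is either a double-quoted name or a bare name without dots.
_SEGMENT = re.compile(r'"([^"]+)"|([^.\s"]+)')


def split_path(path):
    """Split '"Data Profiling".count' into ['Data Profiling', 'count']."""
    return [quoted or bare for quoted, bare in _SEGMENT.findall(path)]


def get_attribute(obj, path):
    """Walk nested attribute groups following the dotted path."""
    value = obj
    for segment in split_path(path):
        value = value[segment]
    return value


# Hypothetical table object with one predefined and one custom grouping structure.
table = {
    "Name": "CUST",
    "Data Profiling": {"count": 1200, "distinct": 1150},
    "Address": {"Street": "1 Main St", "City": "Paris", "ZIP": "75001"},
}
print(get_attribute(table, '"Data Profiling".count'))
print(get_attribute(table, "Address.City"))
```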
o IMPROVED LOCAL DATA DOCUMENTATION
The vocabulary used in the data documentation process had
several variants of "definition", "description", etc., that behaved differently
in the system. This made it difficult to search across them or to understand the
results in worksheets with different types of objects.
- This vocabulary has been simplified and harmonized with the following attributes:
· "Name" is unchanged as the physical name (e.g. CUST) of an imported object (e.g. Table) or the actual name of a custom object (e.g. Term).
· "Business Name" is unchanged as the data documented logical name (e.g. Customer) of an imported object (e.g. Table).
· "Definition" replaces and merges "Description" and "Business Description" as the short un-formatted text that defines any object (i.e. can be used as tooltip).
· "Source System Definition" on imported objects replaces any use of "Definition", "Description", or "Comment" in the source system metamodel (profile) imported from data modeling, data integration, and business intelligence tools.
· "Description" replaces "Long Description" as HTML formatted text of unlimited length that can include images, tables, etc.
- Consequently, the default data documentation attributes are defined as follows:
· Any imported object has a Name, a Business Name and a Definition available by default and may also have a Source System Definition, but will not have a Description (Administrators may add it in Manage Metamodel).
· Any custom object only has a Name by default, but does not have a Business Name, Definition or Description (Administrators may add them in Manage Metamodel).
- Note that imported models from logical/physical data modeling tools (e.g. Erwin) have imported objects (e.g. table) that may have both:
· a logical definition of that table called "Definition" or "Description", which is now called "Source System Definition",
· a physical definition of that table called "Comment" which comes from the SQL COMMENT concept.
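The attribute renaming above can be summarized as a lookup table. The sketch below is a hypothetical helper for reviewing attribute names used by an external application; it only covers the three renames stated above for documentation attributes, and deliberately leaves out the source system attributes. Note that the mapping must be applied exactly once: applying it twice would turn "Long Description" into "Description" and then into "Definition".

```python
# Old attribute name -> new attribute name, as listed in the notes above.
RENAMES = {
    "Description": "Definition",
    "Business Description": "Definition",
    "Long Description": "Description",
}


def migrate_attribute_name(old_name):
    """Return the new name of a documentation attribute (unchanged if not renamed)."""
    return RENAMES.get(old_name, old_name)


def migrate_columns(columns):
    """Migrate a list of selected attribute names, dropping duplicates created by merges."""
    migrated = []
    for column in columns:
        new_name = migrate_attribute_name(column)
        if new_name not in migrated:
            migrated.append(new_name)
    return migrated


print(migrate_columns(["Name", "Description", "Business Description", "Long Description"]))
```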
o IMPROVED MAPPED DATA DOCUMENTATION
Semantic Mapping and Glossary Term Classification have been unified
into the single concept of a new "Is Defined By" binary relationship type (see
further below) with instances stored in the new Semantic Mapping now available
as custom models (see further below).
o IMPROVED DATA DOCUMENTATION EDITOR
As a consequence of the above local and mapped data documentation
improvements, the documentation shown in the documentation editor and the
semantic flow lineage tab has been grouped as Defined, Mapped, or Inferred:
- DEFINED Locally: uses editable "Business Name" and "Definition" standard attributes.
- MAPPED Semantically: uses other objects (e.g., Term in a Glossary) explicitly and directly connected with the "Is Defined By" standard relationship stored in a Semantic Mapping.
- Inferred Documentation uses other means of defining documentation in this priority order:
· RELATED Semantically: Documentation from other objects directly connected with a custom relationship (e.g., "Complies To") set to participate in semantic flow (other than the "Is Defined By" standard relationship stored in a Semantic Mapping).
· INFERRED from Lineage: Documentation from object lineage based on pass-through data flows and semantic flows.
· CLASSIFIED by Term: Documentation from Terms associated with Data Classes (e.g., "Date Of Birth") set to this column or field object.
· IMPORTED Definition: Documentation from read-only "Source System Definition" standard attribute.
· SEARCHED Term: Documentation from Terms with matching names searched in Glossaries.
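The priority order above can be sketched as a simple first-match lookup. The source names follow the list above; placing DEFINED and MAPPED ahead of the inferred sources is an assumption of this sketch, as is the dictionary-based input format.

```python
# Documentation sources, highest priority first. The order of the five inferred
# sources follows the list above; DEFINED and MAPPED first is an assumption.
PRIORITY = ["DEFINED", "MAPPED", "RELATED", "INFERRED", "CLASSIFIED", "IMPORTED", "SEARCHED"]


def resolve_documentation(candidates):
    """Return (source, text) for the highest-priority source that has documentation."""
    for source in PRIORITY:
        text = candidates.get(source)
        if text:
            return source, text
    return None, None


# Hypothetical column with only imported and searched documentation available.
column_docs = {
    "IMPORTED": "Customer master table",
    "SEARCHED": "Customer",
}
print(resolve_documentation(column_docs))
```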
o NEW HTML DOCUMENTATION EDITOR
The user experience of editing descriptions, comments, or the new articles
and issues (see below) has been dramatically enhanced with a brand-new bundled WYSIWYG (What You See Is What You Get) HTML editor. This
editor is available for any custom attribute of HTML data type. This editor
brings the equivalent of Google Docs or Microsoft Word within this web
application, including all the usual text formatting capabilities, image
management, and even copy/paste with formatting from Word or HTML pages.
o NEW DOCUMENTATION SUPPORT FOR OBJECT AND USER MENTIONS
In addition, the above newly bundled WYSIWYG HTML
editor (of descriptions, comments, and articles) has been enhanced to support mentions of users (e.g. @John) and objects (e.g.
@Customer). Users creating new object or user mentions benefit from
auto-completion, or from a more sophisticated search, to find the
right user or object. Existing mentions are automatically maintained within the
documentation upon any renaming of the mentioned object or user.
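One common way to keep mentions valid across renames is to store a stable identifier in the text and resolve the display name at render time. The sketch below illustrates that general technique only; the `@{id}` storage format and the function name are hypothetical, and how the product actually stores mentions is not described here.

```python
import re

# Hypothetical stored form of a mention: @{identifier}
MENTION = re.compile(r"@\{(\w+)\}")


def render(stored_text, current_names):
    """Replace each stored mention with '@' plus the current name of the target."""
    return MENTION.sub(
        lambda match: "@" + current_names.get(match.group(1), "unknown"),
        stored_text,
    )


names = {"obj42": "Customer", "usr7": "John"}
stored = "Ask @{usr7} before changing @{obj42}."
print(render(stored, names))

# After the object is renamed, the stored text is untouched but renders the new name.
names["obj42"] = "Client"
print(render(stored, names))
```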
o NEW ARTICLE OBJECT TYPE
Descriptions can be associated with any harvested object (e.g. imported
table) or custom object (e.g. a glossary term). They now benefit from the above
new bundled WYSIWYG HTML editor with object and user mentions,
but are not intended to be full length documents.
Articles are designed for
business users to develop and collaborate on any kind of document such as
review reports, change requests, white papers, user guides, overviews, etc. Articles are
implemented by a new predefined object "Article" with a predefined attribute
"content" of HTML data type.
A new pre-installed "Standard Extension
Articles" package allows users to create new models of type "Articles" which
contain the Article object type (just like Glossary contains Terms). Manage
Metamodel allows one to extend the Article object type with custom attributes or links to
other custom objects. Articles benefit from the same capabilities as any other
custom objects including search (MQL), security, as well as the ability to have
comments and mentions, and may even operate under workflow.
o NEW ISSUE OBJECT TYPE
An Issue has an HTML based rich text formatted description that can contain
images, tables, and even mentions of users and objects.
- An Issue also has the classic attributes (e.g. Status, Priority, Assignee, Reporter) and the relationships (e.g. Blocks, Related To, Duplicates) commonly used by issue tracking systems such as Atlassian JIRA.
- A new pre-installed "Standard Extension Issues" package allows users to create new models of type "Issues" which contain the Issue object type (just like Glossary contains Terms). Manage Metamodel allows one to extend the Issue object type with custom attributes or links to other custom objects. Issues benefit from the same capabilities as any other custom objects including search (MQL), security, as well as the ability to have comments, and even operate under workflow.
o NEW (MANAGE METAMODEL) STANDARD OBJECT TYPES
The standard package offers additional predefined
object types to model the existing Data Mappings, Semantic Mappings, and the new
generation Data Models as object types, including:
- New relationships as object types (also known as n-ary relationships in ER modeling or relationship as class in UML) which can have attributes. These new relationships as objects are directional (roles can be source or target) and can optionally carry semantic flow (on all roles in such case). Note that these new abstract relationship object types cannot be subtyped in this release. There are two types of relationships as objects:
· The "Binary Relationship" abstract object type connects only two objects at the instance level (with subtypes such as the new "Semantic Link" object type).
· The "N-ary Relationship" abstract object type connects more than two objects at the instance level (with subtypes such as the new "Classifier Map" object type).
- New root abstract object types (required as source/target of relationships that can apply to any repository object) as listed below. Note that these new abstract root object types are virtual (implemented as filters) and cannot be subtyped.
· "Any Object" abstract object type represents any standard, custom or imported object type (as used in the Defines/Is Defined relationship on the new Semantic Link object type).
· "Any Imported Object" abstract object type is a (virtual) subtype of "Any Object" representing only Imported Objects created by import bridges.
- New base object types (required for data mappings) as listed below. Note that these new abstract base object types cannot be subtyped in this release.
· "Any Classifier" object type represents any database table, file system file, etc. (as used in the source/target relationships on the new Classifier Map and Feature Map object types).
· "Any Feature" object type represents any table column, file field, etc. (as used in the source/target relationship on the new Feature Map object type).
o NEW DATA MAPPING CAPABILITIES AND OBJECT TYPES
- Data Mappings are now modeled as objects, as instances of the new "Data Mapping" model type (in Manage Metamodel) which includes new object types: Data Mapping Folder, Classifier Map (with subtypes: Replication Mapping and Query Mapping), and Feature Map. These new data mapping objects benefit from the same capabilities as any other custom objects including search (MQL), security, as well as the ability to have comments, and even operate under workflow.
- New Replication Mapping (evolution of Bulk Mapping) allows for replication between tables and files of matching structures supporting automatic update as columns/fields get added/removed. Replication Mapping is supported between tables (of possibly different database servers / technologies such as SQL Server to Snowflake), between files (e.g. CSV), and between databases and files (e.g. load/unload of files as database tables). Replication mapping supports both flat structures (CSV files, RDBMS tables) and hierarchical structures (JSON files and NoSQL structures).
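The automatic update of a Replication Mapping as columns/fields get added or removed can be pictured as a set difference between the column pairs currently mapped and the pairs that match now. The sketch below is a hypothetical illustration: the function names and the case-insensitive name matching rule are assumptions, not the documented behavior.

```python
def matching_pairs(source_columns, target_columns):
    """Pair source and target columns with the same name (case-insensitive, an assumption)."""
    targets = {column.lower(): column for column in target_columns}
    return {
        (source, targets[source.lower()])
        for source in source_columns
        if source.lower() in targets
    }


def sync_feature_maps(existing, source_columns, target_columns):
    """Return (pairs to add, pairs to remove) to bring the mapping up to date."""
    wanted = matching_pairs(source_columns, target_columns)
    return sorted(wanted - existing), sorted(existing - wanted)


# NAME was dropped and EMAIL was added on both sides since the last harvest.
existing_maps = {("ID", "ID"), ("NAME", "NAME")}
to_add, to_remove = sync_feature_maps(existing_maps, ["ID", "EMAIL"], ["id", "email"])
print(to_add)
print(to_remove)
```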
o NEW SEMANTIC MAPPING CAPABILITIES AND OBJECT TYPES
- Semantic Mappings are now modeled as objects, as instances of the new "Semantic Model" model type (in Manage Metamodel) which includes a new Semantic Link object type. These semantic link objects benefit from the same capabilities as any other custom objects including search (MQL), security, as well as the ability to have comments, and even operate under workflow.
- New search / worksheet driven semantic mapping editing capabilities.
o NEW DATA MODELING CAPABILITIES AND OBJECT TYPES
Data modeling can be externally performed with data
modeling tools (e.g. Erwin) whose models can be imported in MM, and then stitched to a
matching imported database. Alternatively, relational databases could be
imported as a Physical Data Model (PDM, instead of a regular imported Model) where
local documentation and diagrams could be defined. This PDM capability has been
deprecated as it was replaced (a few years ago) by the introduction of the
Relationship and Diagram tabs on any imported database, enabling users to
automatically detect, define and document relationships, and design ER diagrams.
These data modeling capabilities were still limited to relational databases.
This new release fully redesigns the data modeling capabilities with many new
features:
· Data modeling is no longer limited to relational (RDBMS) databases, but now also supports hierarchical (NoSQL) databases, and object stores (e.g. JSON in Amazon S3).
· Data modeling is no longer limited to a given RDBMS schema (as with data modeling tools like Erwin for PK/FK relationships), but now also supports relationships and diagrams between Classifiers (tables or files) located anywhere:
⋅ in any catalog or schema of a given database server (multi-model of an imported model).
⋅ in any database models (e.g. Customer id of a table in the DW database in Snowflake and in the Sales database in SQL Server).
⋅ in any technologies (e.g. PO number of a table in the DW database in Snowflake and the field of a JSON file in Amazon S3).
- Data Modeling is no longer limited to entity relationships of (any) data stores, but now also supports any standard or custom relationships (defined in Manage Metamodel), which now even include Classifier Map, Feature Map, Semantic Link and many more. This opens the door to multi-purpose business diagrams (as explained below) involving different types of relationships to illustrate a use case.
- DATA MODELS AS OBJECTS
As with Data Mappings and Semantic Mappings, Data Models are now modeled as objects, as instances of the new "Data Model" model type (in Manage Metamodel) which includes new object types: Data Model Folder, Entity Relationship containing Column Mapping(s), and ER Diagram containing ER Diagram Object(s). These new data model objects benefit from the same capabilities as any other custom objects including search (MQL), security, as well as the ability to have comments, and even operate under workflow.
- NEW ER DIAGRAMS
including new graphical layout and rendering properties on objects and relationships (colors, icons, fonts, etc.), and multi-purpose use:
· as Technical Data Model Diagrams: the primary use case of ER Diagrams, fully replacing the use of any external data modeling tool for data documentation, and far more powerful as they span multiple data stores and technologies (RDBMS, NoSQL, object stores).
· as Business Use Case Diagrams: these new diagrams can be more business oriented than a pure technical ER Diagram by allowing graphical decorations and any additional object and relationship types (besides joins or PK/FK), such as a Classifier Map, Feature Map, Semantic Link or any custom relationships to illustrate a use case.
· as Object Navigator/Explorer Diagrams: starting from a given object, users can now graphically expand/navigate any relationships with various automatic layouts (e.g. flow).
· Not a substitute for Data Flow and Semantic Flow Diagrams: although the new ER Diagrams are multi-purpose for any relationships between entities/objects of any model (as explained above), they are not a substitute for the existing critical interactive analysis diagrams, which are:
⋅ Data Flow Diagrams for data lineage and impact analysis,
⋅ Semantic Flow Diagrams for semantic definition analysis.
- NEW ENTITY RELATIONSHIPS
· supporting any relationship types (besides joins or PK/FK),
· enabling worksheet / bulk editing of relationships, as well as CSV import/export.
- NEW ENTITIES (This feature may be released post GA as a cumulative patch)
· Allowing the creation of new entities for conceptual / logical data modeling for Enterprise Data Models or new data store requirements.
o NEW DATA FLOW LINEAGE ANALYSIS DIAGRAMS
using fewer objects to render much bigger data flow lineage traces,
and allowing:
- to decorate objects with tags (such as sensitivity label or PII), and
- to compare the lineage with a previous version of that data flow.
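Conceptually, comparing a lineage with a previous version amounts to a difference between two sets of data flow links. The sketch below shows that idea with lineage represented as (source, target) pairs; this representation and the function name are assumptions made for illustration and do not describe the diagram implementation.

```python
def compare_lineage(previous_edges, current_edges):
    """Classify data flow links as added, removed, or unchanged between two versions."""
    previous, current = set(previous_edges), set(current_edges)
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
        "unchanged": sorted(previous & current),
    }


# Hypothetical column-level lineage of two versions of the same data flow.
version_1 = [("STG.CUST.ID", "DW.CUSTOMER.ID"), ("STG.CUST.NM", "DW.CUSTOMER.NAME")]
version_2 = [("STG.CUST.ID", "DW.CUSTOMER.ID"), ("STG.CUST.EMAIL", "DW.CUSTOMER.EMAIL")]
print(compare_lineage(version_1, version_2))
```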
o NEW BUSINESS PROCESS MODELS
compliant with the Object Management Group
(OMG) Business Process Model and Notation (BPMN) standard (see https://www.bpmn.org) with:
- support for importing BPMN XML diagrams from third-party process modeling tools (such as https://www.lucidchart.com/),
- support for linking a BPMN diagram's data store objects to actual repository objects such as database model, schema, table, etc.
o NEW REFERENCE DATA MODELS
(This feature may be released post GA as a cumulative patch)
- Code set mappings, and more.
o IMPROVED DATA SAMPLING AND DATA PROFILING
- New data request methods: fast "Top" (now the default) vs. "Random" (reservoir sampling when available on the database) vs. "Custom Query" (on selected tables)
- New data request scope: subset of tables defined by a provided MQL (e.g. tables from a set of schemas, or table with/without a user defined data sampling flag)
- New data overwrite protection (on selected tables) to prevent an automatic data import (e.g. when a previous long random sampling had been performed)
- New data import operation, independent of the metadata import operation: the option to automatically perform data import after metadata import remains enabled by default, but an explicit data import can now be requested by API or scheduled (Manage Schedules).
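For reference, the "Random" request method mentions reservoir sampling. The sketch below is the textbook single-pass version (often called Algorithm R), which keeps a uniform random sample of k rows without knowing the total row count in advance. It illustrates the technique only; how each database implements it is not described here.

```python
import random


def reservoir_sample(rows, k, rng=None):
    """Return k rows chosen uniformly at random from an iterable of unknown length."""
    rng = rng or random.Random()
    sample = []
    for index, row in enumerate(rows):
        if index < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Keep the new row with probability k / (index + 1).
            slot = rng.randint(0, index)  # inclusive on both ends
            if slot < k:
                sample[slot] = row
    return sample


rows = (f"row-{i}" for i in range(100_000))
print(reservoir_sample(rows, 5, random.Random(42)))
```

By contrast, a "Top" request simply reads the first k rows, which is faster but not representative of the whole table.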
o IMPROVED HIGH LEVEL SHAREABLE USER OBJECTS (Collections, Worksheets, Dashboards)
High level user defined
objects (e.g. Collections, Worksheets, Dashboards, or Presentations) now have
more powerful sharing capabilities with the notions of Owners, Viewers, and
Editors available through a user-friendly UI similar to popular cloud object
stores like Google Drive.
Collections, Worksheets, Dashboards, Users and
Groups are now available in the global search and MQL.
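The Owner / Editor / Viewer notions can be pictured as ordered access levels. The sketch below is an assumption-heavy illustration: the exact actions each role is allowed to perform are not specified above, so the action table is hypothetical and modeled on common cloud sharing conventions.

```python
from enum import IntEnum


class Role(IntEnum):
    VIEWER = 1
    EDITOR = 2
    OWNER = 3


# Hypothetical minimum role per action (not taken from the product documentation).
REQUIRED_ROLE = {
    "view": Role.VIEWER,
    "edit": Role.EDITOR,
    "share": Role.OWNER,
    "delete": Role.OWNER,
}


def is_allowed(shares, user, action):
    """Check whether the user's role on a shared object permits the action."""
    role = shares.get(user)
    return role is not None and role >= REQUIRED_ROLE[action]


# Hypothetical sharing settings of one worksheet.
worksheet_shares = {"ann": Role.OWNER, "bob": Role.EDITOR, "cy": Role.VIEWER}
print(is_allowed(worksheet_shares, "bob", "edit"))
print(is_allowed(worksheet_shares, "cy", "edit"))
```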
o NEW DATA QUALITY
with a new Data Quality tab allowing:
- New import from Data Quality tools:
· From a commercial tool such as the import bridge from Informatica Cloud Data Quality
· From any other unsupported tool (or in-house DQ) using the import bridge from the Meta Integration Data Quality (DQ) CSV Format.
- New ability to stitch (connection resolution) Data Quality models with their associated data store models.
- New Data Quality tab in the UI with data quality score widgets and histograms
- New Search / Worksheet reporting on Data Quality info
- New pre-defined conditional labels on data quality in data flow diagrams
o NEW DATA SOURCE ACCESS HISTORY
New data source access history attributes (including Popularity Count, Last
Access Users, Last Data Access Date, and Last Updated Date) are now available on
selected supported objects (including database tables and views or BI reports)
of imported models from selected supported tools (such as Snowflake, Google
BigQuery, or Tableau).
o NEW TOOL INTEGRATION
with a new Manage > Tool Integration menu allowing:
- Browser Extension (Chrome) tool integration, which automatically displays MM data catalog information for the objects displayed by the web page of a supported web application tool, including:
· Business Intelligence web apps like Qlik Sense, Tableau, Azure Power BI
· Data stores web apps like Snowflake
· Data Quality web apps like Informatica Cloud Data Quality
- Issue Management tool integration like Jira (Summer 2024)
- Communication tool integration like Teams (Summer 2024)
o NEW USAGE ANALYTICS
A new repository operation "Export
analytics" (that can be scheduled on a daily basis) generates usage
analytics from the repository database, API, audit log and search index into CSV
files (by default in $MM_HOME//data/files/mm/analytics). Such files can be
analyzed by the customer BI tool of choice (such as Microsoft PowerBI or
Tableau). An example is provided in $MM_HOME/conf/Template/analytics/demo/demo.pbix. Possible usage analytics currently
include:
- Control over the usage analytics scope (selected configuration, or entire repository) and the interval (Days, Months, Years).
- User growth and login per day
- User search (count, popularity)
- Object Inventory (model count, model types, object count, object types, object growth)
- Glossary (term count and growth)
- Documentation (object with documentation count and growth, top documented models)
- Data Classification (object with data classes count and growth, top data classes, data classes count and growth)
- Data Lineage (object with lineage count and growth, model connection count and growth)
- User Collaboration (count and growth of endorsements, certifications, warnings, comments, and attachments)
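Besides a BI tool, the exported CSV files can also be explored with a few lines of script. The sketch below computes a cumulative growth curve from CSV text; the column names (user, created_date) and the sample rows are hypothetical, since the layout of the exported files is not described here.

```python
import csv
import io
from collections import Counter


def cumulative_growth(csv_text, date_column="created_date"):
    """Return [(day, running total)] from CSV rows carrying an ISO date column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    per_day = Counter(row[date_column] for row in reader)
    total, growth = 0, []
    for day in sorted(per_day):  # ISO dates sort chronologically as text
        total += per_day[day]
        growth.append((day, total))
    return growth


# Hypothetical sample standing in for an exported analytics file.
sample = "user,created_date\nann,2024-01-01\nbob,2024-01-01\ncy,2024-01-03\n"
print(cumulative_growth(sample))
```

To read a real export, the file content could be passed in with `open(path, newline="").read()` once the actual column names are known.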
o NEW SUPPORT FOR MULTI CATALOG DATABASE IMPORT
A critical aspect of importing metadata from large servers
is the support for multi-model incremental harvesting where the import bridge
can detect changes and efficiently harvest the subset that has been updated. In
the case of a large BI server, only the models of the changed reports are
imported. In the case of a large DB server, only the models of the changed schemas
are imported. Not only is this multi-model incremental harvesting much faster,
but it also minimizes the space needed in the repository (with version and
configuration management license) by reusing the models which did not change.
Until now, most database import bridges required the selection of a single
database catalog, apart from SQL Server that allowed the import of multiple
catalogs at once (in such case all schemas of a given catalog were stored as a
single model).
With this improvement, the database import bridges from
popular large cloud servers like Snowflake, Google BigQuery, SAP HANA, Presto,
and Microsoft SQL Server (including on Azure) now provide native multi catalog
support with multi schemas represented as multi-models. This improvement reduces
the number of individual Model Imports to configure, reduces the amount of
repository storage needed (with version and configuration management license),
and accelerates the incremental harvesting. In addition, this improvement also
significantly facilitates the automatic metadata stitching (connection
resolutions) at the entire database server level, automatically resolving
changes on the underlying catalogs, and their respective underlying schemas.
Finally, this improvement strengthens data governance by allowing
responsibilities to be added (Add Roles) at any level, from the entire server model down to
any catalog or schema.
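The principle of multi-model incremental harvesting can be sketched as a comparison of per-schema fingerprints between two harvests: only schemas whose fingerprint changed are re-imported, and the others are reused. How the import bridges actually detect changes is not described here, so the hashing approach and all names below are assumptions.

```python
import hashlib
import json


def fingerprint(schema_metadata):
    """Stable hash of a schema's metadata (keys sorted so ordering does not matter)."""
    payload = json.dumps(schema_metadata, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def plan_harvest(previous_fingerprints, current_schemas):
    """Decide which schema models to import, reuse, or drop."""
    to_import, to_reuse = [], []
    for name, metadata in current_schemas.items():
        if previous_fingerprints.get(name) == fingerprint(metadata):
            to_reuse.append(name)
        else:
            to_import.append(name)
    to_drop = [name for name in previous_fingerprints if name not in current_schemas]
    return {"import": sorted(to_import), "reuse": sorted(to_reuse), "drop": sorted(to_drop)}


# Hypothetical state: SALES changed, HR is unchanged, OLD was removed, NEW appeared.
previous = {
    "SALES": fingerprint({"tables": ["ORDERS"]}),
    "HR": fingerprint({"tables": ["EMP"]}),
    "OLD": fingerprint({}),
}
current = {
    "SALES": {"tables": ["ORDERS", "ITEMS"]},
    "HR": {"tables": ["EMP"]},
    "NEW": {"tables": []},
}
print(plan_harvest(previous, current))
```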
o THIRD-PARTY SOFTWARE UPDATES
All third-party and open source software has been upgraded to the latest
versions for bug fixes, improvements, and better security vulnerability
protection. For more details, see Bundled Third-Party Software.
o SECURITY VULNERABILITY UPDATES
Numerous major improvements have been made to resolve new security vulnerabilities,
including through third-party software upgrades such as the use of Java 17 (instead of
Java 11, which is no longer supported for security vulnerability fixes).
o PRE UPGRADE REQUIREMENTS
- Same steps as any previous releases.
- Physical Data Models (PDM) were deprecated (and replaced by the local data modeling) in 10.0 (2018) but remained available in 11.0 (2022). PDM is now officially EOL and no longer available in 11.1. Therefore, make sure that any legacy PDM models have been migrated to regular (imported) Models prior to this upgrade.
o POST UPGRADE ACTIONS
- Same steps as any previous releases.
- SECURITY VULNERABILITY IMPROVEMENT IMPACT
· REST API Help (Swagger based MMdoc web app) is no longer enabled/deployed on default installation. You must first use $MM_HOME/Setup.sh -we MMDoc -wa MMDoc on the main MM server, as explained in the REST API Documentation Setup.
· MANAGE > Servers: The default installation of a remote harvesting agent server only allows for a Local Network connection. You must first use $MM_HOME/Setup.sh -wa MIMBWebServices on that remote harvesting server to be reachable by the main MM server.
- IMPROVED DATA DOCUMENTATION
Update any external REST API based application using MQL involving Description or Long Description.
- NEW SUPPORT FOR MULTI CATALOG DATABASE IMPORT
A full re-import of the multi-catalog database (e.g. SQL Server or Snowflake), the surrounding ETL/DI tools (e.g. Informatica PowerCenter or Talend), and BI tools (e.g. Microsoft PowerBI, Tableau) is required before taking advantage of the new multi-catalog connection resolutions (i.e. stitching and configuration build).
- IMPROVED SEARCH
See Legacy Local Lucene Index File Known Limitation, and therefore consider at least Migrating from Local Lucene Files to Bundled Solr Indexing Server, or better Migrating from Local Lucene Files to External Solr Indexing Server.
- IMPROVED HIGH LEVEL SHAREABLE USER OBJECTS (Collections, Worksheets, Dashboards)
As a side effect of this major improvement, the URLs of Collections, Worksheets, and Dashboards have changed. Any such URL hard coded when manually editing HTML must be updated.