o NEW FEATURE OVERVIEW
This
new major release brings the key Data Governance (DG) solutions on top of the
existing powerful Data Catalog (DC) and Metadata Management (MM) foundations of
previous versions. MM already offers all Technical Models (including their
metamodels and associated MIMB bridges/connectors) for virtually any data store
(file system / object stores / data lakes, RDBMS, NoSQL, DW), Data Integration
(DI) and Business Intelligence (BI) tools and technologies, and the list of MIMB
supported tools keeps growing thanks to the largest ecosystem of partners. The
key feature of this new version is the ability to define and populate Business
Models for data management such as reference data, data quality, data trust,
data security, data sharing and shopping, data issue management, business rules,
business process modeling and improvements, vertical market specific business
applications and regulation compliance. MM is pre-populated with standard
business models, starting with the Business Glossary which can now be fully
extended with custom business objects and associations. The Data Catalog
capabilities have been significantly enhanced with automatic data classification
(machine learning), now supporting both data classes (previously called semantic
types) and metadata classes (driven by the Metadata Query Language), already
pre-populated to detect and hide the most common Personally Identifiable
Information (PII).
o NEW METAMODEL MANAGEMENT FOR
CUSTOM "BUSINESS" MODELS
Custom "Business" Models can now be defined with
customizable metamodels as needed in many data governance related domains such
as data management, reference data, data quality, data trust, data security,
data sharing and shopping, data issue management, business rules, business
process modeling and improvements, vertical market specific business
applications and regulation compliance.
- Administrators can use a new Manage Metamodel menu to define their custom "business" models with the full power of object modeling, all the way to the graphical editing of UML class diagrams for each business model. See help.
· The modeling starts by defining reusable attributes, promoting data standardization among business objects. Such attributes can be of any basic type such as integer, string, date, or enumeration, but also richer types like email, web URL, or phone number, offering a better user experience (send an email, make a phone call, etc.). See help.
· Custom "Business" objects are then created based on these reusable attributes, and custom associations can be created, including regular reference relationships, but also composition links (UML aggregations), and UML generalizations allowing one to define abstract business objects. See help.
Custom "Business" objects have a name and icon that can be searched from
an expansive bundled library of icons, customized (e.g. change color), uploaded
(from external sources), or even designed in the UI (start from a shape, color,
etc.).
· Finally, custom "business" objects are associated to custom "business" models, ready to be populated. MM is pre-populated with a few standard (system read only) business models (starting with the business glossary model) and a few model extensions. Associations can therefore refer to business objects across different business models. See help.
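The modeling concepts above (reusable attributes, business objects, composition, and generalization) can be sketched roughly as follows. This is an illustrative Python sketch only; all class, attribute, and object names here are hypothetical and not part of the product's API.

```python
from dataclasses import dataclass

# Hypothetical sketch of the metamodel concepts described above:
# reusable attributes, business objects built from them, and
# associations (reference links, composition, generalization).

@dataclass(frozen=True)
class ReusableAttribute:
    name: str
    type: str          # e.g. "string", "date", "email", "phone"

@dataclass
class BusinessObject:
    name: str
    attributes: list
    parent: object = None       # UML generalization (abstract parent)
    abstract: bool = False

@dataclass
class Association:
    source: BusinessObject
    target: BusinessObject
    kind: str          # "reference" or "composition"

# Reusable attributes shared across business objects (data standardization)
owner_email = ReusableAttribute("Owner Email", "email")
description = ReusableAttribute("Description", "string")

# An abstract business object and a concrete specialization of it
governed_asset = BusinessObject("Governed Asset", [description], abstract=True)
business_rule = BusinessObject("Business Rule", [owner_email], parent=governed_asset)

# A composition link: a Business Policy contains Business Rules
business_policy = BusinessObject("Business Policy", [description])
contains = Association(business_policy, business_rule, kind="composition")
```

The point of the sketch is the reuse: both objects share the same `ReusableAttribute` instances rather than redefining equivalent attributes per object.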
- Users can then use the UI for data entry, analysis and reporting on such custom "business" models with the same capabilities as with harvested / imported "technical" models, including their use in the Metadata Query Language (MQL), Worksheets and therefore Dashboards. In addition, Business Models also offer a new Hierarchy tab allowing one to drill down hierarchically in both data entry (including bulk editing) and reporting. Workflow can also be applied to business models, where objects such as business rules or business policies can go through an elaborate workflow from proposed, draft, and approved all the way to published (and even deprecated) to the end users. See help.
- Integrators have external bulk editing/reporting available through CSV import/export capabilities, as well as the REST API, allowing one to define actual connectors (bulk or real-time sync) with the actual tools / applications behind the business models, such as JIRA for the Data Issue Management model, or the customer's custom DQ applications for their Business Rule Model. See help.
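As a minimal sketch of the CSV bulk path, an integrator could generate a CSV of business objects for import. The column names below are hypothetical; the actual columns depend on the custom metamodel defined in Manage Metamodel.

```python
import csv
import io

# Hypothetical business rule records to be bulk-loaded as business objects;
# the column names are illustrative, not a documented import format.
rules = [
    {"Name": "No null SSN", "Description": "SSN must be populated", "Severity": "High"},
    {"Name": "Valid ZIP",   "Description": "ZIP must match pattern", "Severity": "Low"},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["Name", "Description", "Severity"])
writer.writeheader()
writer.writerows(rules)
csv_payload = buffer.getvalue()
print(csv_payload)
```

The same payload could then be pushed through the CSV import UI, or a real-time connector could be built against the REST API instead.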
o NEW METAMODEL MANAGEMENT FOR
IMPORTED "TECHNICAL" MODEL EXTENSIONS
Imported "Technical" Models are based
upon predefined metamodels associated with MIMB bridges/connectors for virtually
any data store (file system / object stores / data lakes, RDBMS, NoSQL, DW),
Data Integration (DI) and Business Intelligence (BI) tools and technologies.
Such predefined technical metamodels can now be extended for data documentation
purpose with the same Manage Metamodel (admin) UI used to define custom
"business" models.
Therefore, the new Manage Metamodel (admin) UI not only
allows the creation of new custom "business" objects (for the new custom
"business" models), but also the creation of new imported "technical" objects
defined (scoped) as a set of technical objects predefined in the import bridge
metamodels. For example, a new generic "data field" imported object can be
defined as either an RDBMS table column, a NoSQL JSON field, a CSV field,
etc.
Consequently:
- Custom Attributes can now be defined and applied the same way (and are therefore reusable) for both imported "technical" objects and custom "business" objects. Not only does this eliminate the previous Manage Custom Attributes (admin) UI, but more importantly it avoids redefining the scope of each custom attribute applying to similar imported objects (as was frequently the case for table/file/entity or for column/field/attribute).
- Custom relationships can now be defined from a custom "business" object to imported "technical" objects. For example, a new custom model called "business policies" can contain a new custom object called "business rule" which can have an "enforce" custom relationship to an imported object called "data field" as defined above.
- Custom relationships can optionally be set to be involved in the semantic flow. In the above use case, this allows the semantic flow tab to include not only the term definition of a table column, but also the business rules, all the way to business policies.
- This allowed the implementation of the term classification process (now called term documentation) with an actual custom relationship "Defines" from "Term" to a new predefined "Imported Object" in the predefined standard metamodel. Although this new term documentation implementation as a relationship has no impact on or direct benefit to the user experience, it offers solutions to the continuous changes in technology and architectures. For example, the data documentation (including term documentation) of a well-documented data warehouse on premises (e.g. Teradata) can be exported and reimported to a new implementation of that same data warehouse on the cloud (e.g. Snowflake).
o IMPROVED DATA DOCUMENTATION
AUTOMATION AND PRODUCTIVITY
The data documentation process of imported
"technical" models is a critical part of any data catalog.
- Any imported object (e.g. tables/files, columns/fields) comes with a physical name which needs documentation with a user-friendly (logical) name and description, which are now better presented and managed in 4 categories: See help.
· Business Documentation offers local documentation with a business name and business description. This can be used as an alternative to the term documentation below, or as a means to supersede an existing term documentation with a better local definition.
· Term Documentation (previously called term classification) allows one to document any imported object with one or more terms from a glossary (now creating an "Is Defined By" relationship).
· Mapped Documentation allows one to document any imported object connected by a semantic mapping with one or more terms from a glossary, or entities/attributes from a data model.
· Inferred Documentation provides data documentation on any imported object, automatically generated from other objects involved in its data flow pass-through lineage and impact. This is a powerful feature dramatically increasing the automatic data documentation coverage on many data stores (ODS, data lake, DW) of the Enterprise Architecture.
- The data documentation process and presentation have been considerably
improved in the UI (on any imported object overview tab) with wizards for
business documentation and/or term documentation (with term reuse or creation on
the fly), suggesting business-friendly logical names (from naming standards and
supervised learning when enabled) from physical names, and descriptions from
inferred documentation.
New "Term Documentation" and "Inferred Documentation"
attributes are available in the REST API, MQL, and therefore worksheets and
dashboards, allowing one to create KPI graphical widgets on the data
documentation coverage.
o NEW DATA CLASSIFICATION
Data classification is
a critical part of data cataloging automation and therefore received major
enhancements over the previous concept of Semantic Types, now renamed Data
Classes:
- New Data Classes of type "Metadata", which is a metadata-driven classification process powered by the Metadata Query Language (MQL), allowing one to detect classes by metadata name (e.g. field / column / attribute name), which is critical to detect many PII that cannot be detected by data sampling/profiling, such as maiden name, date of birth or place of birth. See help.
- Improved Data Classes of type "Data" which is the classic data sampling driven data classification process based on: See help.
· Enumerations such as a list of codes / values,
· Patterns such as a US SSN with 999-99-9999,
· or regular expressions such as a US ZIP with ^[0-9]{5}(-[0-9]{4})?$
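The pattern and regular expression styles above can be checked directly, for example in Python. The SSN digit-mask pattern 999-99-9999 corresponds to the first regex below; the ZIP regex is the one given above.

```python
import re

# The US SSN digit-mask pattern "999-99-9999" expressed as a regex:
ssn_re = re.compile(r"^[0-9]{3}-[0-9]{2}-[0-9]{4}$")
# The US ZIP regex as given above (5 digits, optional +4 extension):
zip_re = re.compile(r"^[0-9]{5}(-[0-9]{4})?$")

assert ssn_re.match("123-45-6789")
assert zip_re.match("90210") and zip_re.match("90210-1234")
assert not zip_re.match("9021")        # too short: not a valid ZIP
```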
- In addition, Data Classes now benefit from the following new features:
· new control over matching threshold and uniqueness threshold.
· new machine learning based automatic discovery of data class patterns or enumerations (e.g. automatically learning new code values)
· new server side re-classification on demand (e.g. after adding new data classes) therefore no longer requiring one to perform a new data sampling / profiling to take advantage of the new data classes.
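The matching-threshold idea above can be illustrated with a small sketch: classify a column as a data class when the fraction of non-empty sampled values matching the class pattern meets the threshold. This is an illustrative sketch of the concept, not the product's actual algorithm.

```python
import re

def matches_class(samples, pattern, matching_threshold=0.8):
    """Return True when enough non-empty sampled values match the pattern.

    Illustrative only: a real implementation would also consider the
    uniqueness threshold, enumerations, and learned patterns.
    """
    values = [v for v in samples if v]           # ignore empty values
    if not values:
        return False
    hits = sum(1 for v in values if re.fullmatch(pattern, v))
    return hits / len(values) >= matching_threshold

zip_pattern = r"[0-9]{5}(-[0-9]{4})?"
samples = ["90210", "10001", "0210", "94105-1234", "60601"]
print(matches_class(samples, zip_pattern))       # 4 of 5 match -> True at 0.8
```

Raising the matching threshold above 0.8 would reject this sample set, since only 4 of the 5 values match.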
- Improved Data Classes of type "Compound" (e.g. PII) based upon multiple data classes of data detection type (e.g. SSN) or metadata detection type (e.g. Date of Birth) allowing one to hide PII within any data sampling / profiling without any machine learning or customization to start with. See help.
- MM is now pre-populated with PII data classes of type data (as previously for SSN), but also new PII data classes of type metadata (e.g. Date of Birth), and new PII data classes of type compound combining all types of PII data classes. See help.
- Redesigned the data classification architecture to be processed on the MM server side allowing for on demand / refreshed automatic data cataloging (e.g. after new data classes are created/updated).
o NEW OBJECT SENSITIVITY LABELS
- A new Manage Sensitivity Labels (admin) UI has been created to define sensitivity labels as an ordered flat list such as: Unclassified > Confidential > Secret > Top Secret. Each sensitivity label has a description, a hide data property (only used when applied to a column/field), and a color (for example Confidential can be orange and Top Secret red). By default, no sensitivity labels are predefined, which means that this feature is disabled by default. See help.
- Sensitivity labels can be manually applied by authorized users (with a role that includes the Data Classification capability) to any individual object, from an entire model, a report, a schema, or a table, all the way down to a column. Note that there is no inheritance: setting a schema to Secret does not make each of its tables and respective columns Secret. Sensitivity labels can also be set in bulk (e.g. multiple columns at the same time). See help.
- Sensitivity labels can automatically be set as "Sensitivity Label Data Proposed" through the automatic data classification detection. For example, a data class SSN can be associated with a sensitivity label called Confidential or GDPR. In that case, any table columns or file fields detected as SSN will also automatically be set with that Confidential or GDPR sensitivity label. Note that in this case the approval process of data classes also applies to sensitivity labels. In addition, approving a data class detection on a given object also approves its associated sensitivity label. See help.
- Sensitivity labels can be automatically inferred as "Sensitivity Label Lineage Proposed" following the data flow lineage (similar to the Inferred Documentation concept), but instead going through any data flow (with transformation or not). This powerful new solution allows automatic sensitivity label tagging across the enterprise architecture, and has been implemented and optimized through a server cache detecting any data flow changes in the configuration. As with "Sensitivity Label Data Proposed", the "Sensitivity Label Lineage Proposed" can be rejected, therefore stopping the propagation of inferred sensitivity labels in that data flow direction. Note that the propagation of inferred sensitivity labels is also stopped by any data masking discovered within the ETL/DI/Script imports involved in that data flow. See help.
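The lineage propagation with its two stop conditions (user rejection, data masking) can be sketched as a graph traversal. This is an illustrative sketch of the concept, not the product's implementation; the node and edge names are hypothetical.

```python
from collections import deque

def propagate_labels(flows, labeled, rejected, masked_edges):
    """Propagate sensitivity labels along data-flow lineage.

    flows: dict mapping a source object to its data-flow targets.
    labeled: dict of objects with an already-set sensitivity label.
    rejected: objects where a user rejected the proposed label.
    masked_edges: (source, target) flows with data masking detected.
    Returns the proposed labels (illustrative sketch only).
    """
    proposed = dict(labeled)
    queue = deque(labeled)
    while queue:
        node = queue.popleft()
        for target in flows.get(node, []):
            if (node, target) in masked_edges or target in rejected:
                continue                      # propagation stops here
            if target not in proposed:
                proposed[target] = proposed[node]
                queue.append(target)
    return proposed

flows = {"stg.ssn": ["dw.ssn"], "dw.ssn": ["mart.ssn", "report.ssn"]}
labels = propagate_labels(
    flows,
    labeled={"stg.ssn": "Confidential"},
    rejected=set(),
    masked_edges={("dw.ssn", "report.ssn")},  # masked in the ETL job
)
print(labels)   # report.ssn gets no label: the flow into it is masked
```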
- Sensitivity labels are highly visible in the UI (at the top of any object overview), and can be queried through MQL (in the UI or REST API). Applications can be built to query these sensitivity labels in order to automatically generate / enforce data security on the data stores (e.g. databases or file systems with Ranger). Note that sensitivity labels do not directly set or bypass the role-based security of the MM repository, or automatically hide data from the MM repository (these actions can be set separately in MM). See help.
o NEW OBJECT CONDITIONAL LABELS
- A new Manage Conditional Labels (admin) UI has been created to define conditional labels based on the Metadata Query Language (MQL) such as "Highly Commented" based on objects with over 10 comments. See help.
- Each conditional label has a name and icon that can be searched from an expansive bundled library of icons, customized (e.g. change color), uploaded (from external sources), or even designed in the UI (start from a shape, color, etc.). See help.
- Conditional labels are visible in the overview page of any object. See help.
- Conditional labels can be displayed in search results and worksheets. See help.
- Conditional labels can be displayed in data flow lineage diagrams.
o NEW OBJECT WATCHER AND EMAIL NOTIFICATION
- The Manage Emails (Admin) UI (used to set up notification emails) has been extended to enable the Watcher capabilities at the server level (with an adjustable frequency: the server will check every 15 minutes by default, but this can be set to hourly or more to avoid overloading the server). See help.
- The Manage Object Roles (Admin) UI has been extended with a Watcher Editor capability (allowing a user to start/stop watching an object), and a Watcher Manager capability (allowing one to add/remove anyone as a watcher of an object). See help.
- The Manage Users (admin) UI has been extended to set the watcher notification frequency (of a given user) from daily (default), to never (i.e. turned off), to near real time (as set up at the server level). Note that the same watcher frequency can also be set up by each individual user in their top right menu for user profile / preferences. See help.
- With all the above Watcher capabilities configured, users can now see a watcher icon ("eye") on the object overview page at the top right (next to the sensitivity label, and endorsement icons and menus). The watcher icon shows the count of watchers on that object and offers menus for the user to start/stop watching that object, and possibly add/remove other user watchers if authorized (with the Watcher Manager capability). See help. A new "Watchers" attribute is available that can be used in search, MQL, or for UI customization. See help.
- The Watcher capabilities are supported on both imported "technical" models and custom "business" models. However, the watcher capabilities are available at the model level only (e.g. not down to just a column or a term object). In case of imported "technical" models harvested as multi-model, one can watch the entire multi-model (e.g. entire database server), or individually watch any desired sub-model such as a given schema of PostgreSQL or a given Workbook of Tableau. See help.
- Watchers of imported "technical" models receive a separate email per model and per type of activity as follows: See help.
· Any metadata harvesting driven changes at any level (e.g. add/delete/update of any schema/table/column/type) as soon as (in real time) an import (incremental harvesting) succeeds with changes, or fails. In such a case, the watcher notification email includes change summary statistics (e.g. number of added, deleted, updated objects), and an MM server URL link to its model version comparator report for full details.
· Any other changes such as data documentation (e.g. business name, description, or term classification), social curation, etc. at any lower level (e.g. table, column, data type), as often as defined by the server or a user. In such a case, the watcher notification email includes change summary statistics (e.g. number of changed objects), the top 5 changed objects (with an MM server URL link to the overview page of each object), and finally the detailed changes (with an MM server URL link to the search UI filtered by the content of that model and ordered by last modified).
- Watchers of custom "business" models, Data Mappings, Semantic Mappings, and Physical Data Models receive a separate email per model on any change at any level. See help.
· Any changes at any level (e.g. add/delete objects, update attributes, add/delete relationships, etc.), as often as defined by the server or a user. In such a case, the watcher notification email includes change summary statistics (e.g. number of changed objects), the top 5 changed objects (with an MM server URL link to the overview page of each object), and finally the detailed changes (with an MM server URL link to the search UI filtered by the content of that model and ordered by last modified).
- Independently of the above watcher capabilities, other notification emails are also sent to users based on their roles/capabilities, including: See help.
· Workflow transitions on objects where the user has a workflow role, as often as defined by the server or a user
· Configuration changes (e.g. add/remove model, edit connections) as often as defined by the server or a user, or build errors (in real time) on configurations where the user is a repository manager of that configuration.
· Server errors (e.g. server down) in real time to the user with Application Administration capability.
o NEW OBJECT ROLES & GLOBAL ROLES
- User roles are no longer predefined or hard coded, but are instead custom
built upon an extensive set of elementary capabilities, such as metadata
viewing, data administration, workflow editing, etc.
User roles are either:
· Object Roles: such as a Curator may have the capability to Comment, Warn or Endorse the metadata in contents to which they are assigned. See help.
· Global Roles: such as a Security Administrator could have the capability to update group and user role and object assignments. See help.
- Administrators can associate users or groups of users with object or global roles; this association is referred to as a responsibility. This way, one may quickly assign roles and the associated responsibilities to individual users or entire groups of users, as needed. See help.
- Data governance, data cataloging, data administration, metadata management, data quality, etc., have a variety of "flavors", best practices, roles and responsibilities. Nevertheless, MM is pre-populated with a set of essential roles (such as Content custodian, Data Owner, etc.) that can then be customized. Because MM is so flexible in implementation and nearly infinitely customizable, one may tweak or even re-engineer the groups and roles to fit the very specific needs of a given organization.
· A set of global roles to provide high-level administration of the metadata management environment. See help.
· A set of commonly used object roles to allow assignment of all of the capabilities available in the product. See help.
· More specific global and object roles tailored to specific metadata management activities, scenarios and use cases as identified in the user guide.
· Specific global and object roles tailored to specific modeled business processes.
· A RACI (Responsible, Accountable, Consulted, Informed) based example of global and object roles and their assignment.
o NEW WORKSHEET ATTRIBUTES
- The "Stewards" attribute has been moved to the new concept of roles. Therefore, Stewards has been moved from the attribute sheet widget of the Overview tab to a new dedicated Responsibilities tab. Note that a new widget for Responsibilities is also available for anyone to add to the Overview tab if desired.
- The "Used" attribute has been renamed to "Has Semantic Usage". More new lineage attributes are also available: "Has Semantic Definition", "Has Data Lineage", and "Has Data Impact". This makes it possible to detect unused objects. All these attributes can also be used as filters.
- The "Semantic Types" attribute has been renamed "Data Classifications".
- The "Inferred Semantic Types" widget of the Overview tab has been moved to a new attribute "Data Classifications Matched". More new data classification attributes are also available: "Data Classification Rejected", "Data Classification Approved".
- The "Term" attribute has been renamed "Is Defined By" and is now a list of Terms instead of one.
- A new "Term Documentation" attribute shows the list of terms (name and description) documenting the object (it can also be used as a filter).
- A new "Mapped Documentation" attribute shows the list of semantically mapped objects (name and description) documenting the object (it can also be used as a filter).
- A new "Inferred Documentation" attribute shows the list of terms (name and description) indirectly documenting the object through its pass-through data lineage / impact (it can also be used as a filter).
- A new "Documentation" attribute shows the summarized documentation of the object. The summarized documentation returns the first documentation found on the object in the following priority order: Business Documentation > Term Documentation > Mapped Documentation > Inferred Documentation > Imported (Documentation) > Searched (Documentation). This attribute can also be used as a filter.
- "Business Name Inferred", "Business Name Inferred Origin", "Business Description Inferred", "Business Description Inferred Origin" attributes have been deprecated (but still available in this release) as they have been replaced by the new "Documentation" attributes.
- The "Documentation" attribute of glossary terms has been renamed to "Long Description" to not conflict with the new "Documentation" attribute described above.
- New "Data Profiling" attributes have been added: "Data Profiling"."Distinct", ."Duplicate", ."Empty", ."Valid", ."Invalid", ."Min", ."Max", ."Mean", ."Variance", ."Median", ."Lower Quantile", ."Upper Quantile", ."Avg Length", ."Min Length", ."Max Length", ."Inferred Data Types".
- The "Certifications", "Endorsements", "Comments", "Warnings" attributes have been renamed to "Certified By", "Endorsed By", "Commented By", "Warned By". In addition to previously supporting filtering, they can now be used as columns showing the list of users that "Certified", "Endorsed", "Commented" or "Warned" the object.
- The "Endorsement Count", "Comment Count", "Warning Count" attributes have been added to the list of possible filters, allowing one to produce worksheets/dashboards with popular objects and more.
- The "Certified" attribute was added to the list of filters, again for data governance worksheets/dashboards.
- The "Parent Object Name" and "Parent Object Type" attributes have been added.
- Object roles can be used as columns or filters.
· filter example: expandedMembersOfRole('Steward') = ANY('Business Users')
· select example: membersOfRole('Steward')
- Object relationships/children can be used as columns.
- Terms' workflow "Status" and "State" attributes have been replaced by the different and more generic attributes "Workflow State", "Workflow Published", and "Workflow Deprecation Requested", which now apply to any object of a user model under workflow.
- The "Last Modified Date" and "Last Viewed Date" attributes have been renamed to "Updated Date" and "Viewed Date".
- The "Created Date", "Created By", "Updated By" attributes have been added (also available as filters). The "Created Date" and "Created By" attributes only apply to non-imported objects.
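The priority order of the new summarized "Documentation" attribute described above amounts to a coalesce over the documentation sources. A minimal illustrative Python sketch (not the product's implementation):

```python
# Priority order of the summarized "Documentation" attribute:
# the first non-empty documentation source wins.
PRIORITY = ["Business", "Term", "Mapped", "Inferred", "Imported", "Searched"]

def summarized_documentation(docs):
    """docs: dict mapping a source name to its documentation text (or None)."""
    for source in PRIORITY:
        text = docs.get(source)
        if text:
            return source, text
    return None, None

docs = {"Business": None, "Term": "A customer's postal code", "Inferred": "ZIP"}
print(summarized_documentation(docs))  # ('Term', "A customer's postal code")
```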
o NEW WORKSHEET FEATURES
- In addition to sorting by Name and Relevance, new ability to sort by "Updated Date" in search, worksheets and object explorer (ORDER BY "Updated Date" in MQL)
o NEW CLOUD IDENTITIES
MANAGEMENT FOR METADATA HARVESTING
New MIMB infrastructure allows password
parameters to be based on an external (MM managed) cloud identity (on Amazon Web
Services, Google Cloud, or Microsoft Azure), where the Secret / Password
parameter can be:
- A secret identifier, which is a URL to the actual secret in a cloud identity secret vault (allowing for external storage of such a secret / password in a cloud secret vault).
- Empty (no longer mandatory) and the authentication is based on the cloud identity on select bridges (such as Microsoft Azure Data Lake Storage, Microsoft Azure Blob Storage, and more to come).
- Note: early releases provided this feature as MANAGE > Secret Vault. See upgrade details below.
o IMPROVED METADATA REPORTING AND PRESENTATION
- New graphical widgets (e.g. Responsibilities, Relationships) on
object page presentations (e.g. Overview).
See help.
- Manage Default Presentation now supports import/export between servers.
o IMPROVED METADATA QUERY
LANGUAGE (MQL)
MQL no longer needs the special character syntax on attributes,
and adds support for many more "system" objects related to data sampling, data
profiling, data classification, user roles, and workflow actions, dramatically
extending the power of metadata reporting (worksheets and dashboards), and
even superseding the hard-coded implementation of menus like My Workflow
Actions, or My Term Changes (now renamed My Changed Objects), which is fully
customizable with MQL.
For more details on the MQL changes, see the MQL new
or improved features, deprecated
features, and removed
features.
o IMPROVED REST API
especially in support of the new features of this version such as
classification, object and global roles.
For more details on the REST API
changes, see the new
or improved API methods, deprecated
API methods, and removed
API methods.
o IMPROVED ADMINISTRATION UI
such as Manage Users, Groups, Roles, and Classification, for a more harmonized
and intuitive look & feel and improved editing capabilities.
o CHANGES FROM PREVIOUS VERSIONS
- Custom Attributes are now defined within Manage Metamodel.
· All licenses allow editing custom attributes within Manage Metamodel; however, defining new custom objects, relationships and models requires a specific license.
· The repository database upgrade includes a "Migrate Custom Attributes to Metamodel" operation (see Manage Operations for log).
- Glossaries are now defined within Manage Metamodel.
· The "Standard" package includes a new Glossary metamodel with new KPI and Acronym objects, but removes the Category object which was the only (previous version) way to create a Term hierarchy. Terms can now contain Terms, and no longer require the creation of a Category.
· The repository database upgrade includes a "Migrate Glossaries to Metamodel" operation (see Manage Operations for log).
⋅ The (previous version) Glossary Categories are now migrated as Terms without loss of the hierarchy at the instance level (i.e. glossary term path). Any MQL use of Categories has been updated to Terms as part of the migration. Note that the count of Terms might be higher, as Categories are now Terms.
⋅ The (previous version) Glossary Term predefined (hard coded) relationships (e.g. More General/More Specific, Contains/Contained by, Represents/Represented by, References/Referenced by, See Also) are migrated as part of an optional "Glossary Extension (MM)" package without loss. However, these relationships are not a mandatory part of the "Standard" package, as users can now define better custom relationships with appropriate names.
⋅ The (previous version) Glossary Term had 2 workflow related properties:
⋅ Status has been renamed "Workflow State" with the same values (Candidate, Draft, Under Review, Pending Approval, Approved, Published, and Deprecated).
⋅ State (Deprecated, New, and Published) is replaced by "Workflow Published" (True/False) and "Workflow Deprecation Requested" (True/False).
⋅ The (previous version) Glossary Term Abbreviation attribute is migrated to a new "Has Acronym" relationship to a new "Acronym" object. Note that Alternative Abbreviation is not migrated.
⋅ The (previous version) Glossary Categories and Terms were used to implement naming standards, which are now implemented by a "Naming Standards" model which contains "Naming Standard" objects containing "Naming" objects. When enabling naming standards, users now have to select which "Naming Standard" object they want to use (instead of a glossary category). The repository database upgrade includes a "Migrate Glossaries to Metamodel" operation which also migrates any Categories and Terms used for naming standards into new objects of the "Naming Standards" model.
· The metadata harvesting browse path (in import/export bridge parameters) is no longer defined as * by default (which allowed browsing any drives, directories and files), for obvious security vulnerability reasons. Administrators must use the Setup UI or command line to define the scope of file browsing.
· The REST API help (MMDoc.war) and any other Tomcat Web Apps are no longer enabled by default for security vulnerability reasons (Swagger unauthenticated sensitive endpoints). They have been moved from $MM_HOME/tomcat/webapps to $MM_HOME/tomcat/dev. If desired, these webapps can be enabled with the Setup UI or command line as follows: $MM_HOME/Setup.sh -we mmdoc. This will create the context MMDoc.xml in $MM_HOME/tomcat/MetaIntegration/localhost to make the webapp available and start it.
· The dashboards can no longer store and execute user-defined JavaScript for security vulnerability reasons (to mitigate XSS vulnerabilities). Consequently, a new third-party library (DOMPurify) strips all XSS properties when rendering the custom HTML of the HTML widget. Therefore, if the HTML contains any tags that are not purely formatting tags (font color, size, images, etc.), then they will be removed automatically before displaying the HTML. In addition, all HTML attributes allowing one to enter JavaScript (onclick, onload, etc.) are also removed.
o THIRD-PARTY SOFTWARE UPDATES
All third-party & open source software has been upgraded to their latest
versions for bug fixes, improvements, and better security vulnerability
protection. For more details, see Bundled Third-Party Software.
o SECURITY VULNERABILITY UPDATES
Numerous major improvements to resolve any new security vulnerabilities.
o PRE UPGRADE REQUIREMENTS
- Successful cleanup of the repository See help.
- Successful upgrade to the previous major release, including the post-upgrade manual migration of any single-model database import (deprecated in the previous release, now not supported but still working) into a multi-model database import. For more details, see the POST UPGRADE section of the previous version release notes.
- Successful update to the latest MIMM and MIMB cumulative patches for the previous major release for the main application server, as well as all metadata harvesting servers. Post patching best practice assumes:
· successfully re-harvesting (import) of all models (in order not to blame the new major release later),
· rebuilding of all configurations,
· deleting any unused versions of models and configurations (it may take 3 days for the database to purge deleted models),
· and making sure that the database maintenance and search index are up to date.
- Successful database backup and restore of the MetaKarta repository using the native database backup/restore technology (do not use the MetaKarta application backup).
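For the bundled PostgreSQL case, such a native backup/restore can be sketched as follows; the database name MM, host, user, and file names are placeholder assumptions to adjust for your environment:

```shell
# Hypothetical sketch: native PostgreSQL backup/restore of the repository
# database (do not use the application-level backup). All connection
# parameters below are placeholder assumptions.
backup_mm_repository() {
  pg_dump --format=custom --host=localhost --username=mmuser \
          --file="mm_repository_$(date +%Y%m%d).dump" MM
}
restore_mm_repository() {
  # Restore into a new, empty database instance (e.g. for the QA clone below).
  pg_restore --host=localhost --username=mmuser --dbname=MM_QA "$1"
}
```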
- Successful backup of the install data directory.
- Clean install of the latest build of the full MetaKarta in a new directory (do not reuse/overwrite the previous version's install directory).
- Repository on PostgreSQL Database:
· If you were using the bundled PostgreSQL database server, which is only available with the Windows version of MetaKarta, then this database must first be upgraded to the new PostgreSQL 13.2 bundled with this new version of MetaKarta (see Application Server Upgrade > Reconfigure your MetaKarta Database Server (ONLY if you are using the bundled PostgreSQL database on Windows)).
- WARNING: Finally, it is highly recommended to first test in a completely separate QA environment configured as follows:
· Assuming all the above steps have been performed in the production environment, make a full (dump) copy of the repository database instance into a new one (it is recommended to temporarily stop the application server and database server for that).
· Perform a clean install of the full MetaKarta software in an empty directory on the new QA machine.
· Copy the data directory from the MetaKarta production installation directory to the QA installation directory (this avoids a full Lucene re-indexing).
· Use the setup utility to point to the new QA repository database instance, and start the application server.
o POST UPGRADE ACTIONS
- Level 1 - Review the logs for any automatic upgrade migration errors
· MANAGE > System Log
⋅ should look as follows in a typical successful upgrade migration:
MIRWEB_I0044 Starting database upgrade. Product version is 31.31.2 whereas database version is 30.22.2.
DBUTL_I0031 Updating database from version 30.22.2
...
MIRWEB_S0005 Running operation: Upgrade data mappings to create classifier links
MIRWEB_S0005 Running operation: Migrate Custom Attributes to Metamodel
MIRWEB_S0005 Running operation: Migrate Business Glossaries to Metamodel
MIRWEB_S0005 Running operation: Migrate Term Classification links to Metamodel
MIRWEB_S0005 Running operation: Migrate Hide Data property to Sensitivity Label
MIRWEB_S0005 Running operation: Migrate steward of content with the 'Send email notification when an import' option to watcher
...
SEARCH_I0005 Indexing model [463,1].
...
MIRWEB_S0101 Server is initialized.
⋅ For automated test purposes, the following messages can be searched in the log:
⋅ On a successful start/upgrade of the server:
MIRWEB_S0101 Server is initialized.
⋅ On a failure to upgrade the server:
MIRWEB_F0003 Service initialization error:
⋅ On a failure to start the server:
MIRWEB_F0004 General error during service initialization:
· MANAGE > Operations provides the details for each migration operation (in case of errors in the above system log):
⋅ Upgrade data mappings to create classifier links
⋅ Migrate Custom Attributes to Metamodel
⋅ Migrate Business Glossaries to Metamodel
⋅ Migrate Term Classification links to Metamodel
⋅ Migrate Hide Data property to Sensitivity Label
⋅ Migrate steward of content with the 'Send email notification when an import' option to watcher
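The log checks above can be automated with a small shell function; a minimal sketch, where the MIRWEB message codes come from this note and the function name and log file path are assumptions:

```shell
# Classify the outcome of an upgrade from an exported system log file,
# using the MIRWEB message codes listed above.
check_upgrade_log() {
  local log_file="$1"
  if grep -q "MIRWEB_F0003 Service initialization error" "$log_file"; then
    echo "upgrade-failed"
  elif grep -q "MIRWEB_F0004 General error during service initialization" "$log_file"; then
    echo "start-failed"
  elif grep -q "MIRWEB_S0101 Server is initialized." "$log_file"; then
    echo "ok"
  else
    echo "unknown"
  fi
}
```

For example, `check_upgrade_log system.log` prints `ok` after a successful upgrade migration.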
- Level 2 - Test Previous Release Basic Features (Object Search, Explore, Edit, Trace Lineage, etc.)
· Do not yet start any metadata harvesting (model import) or configuration builds until you reach Level 4 below. In fact, the upgrade process preemptively disables any automatic harvesting post upgrade (see MANAGE > Schedules).
· Make sure you read the above release notes on: NEW WORKSHEET ATTRIBUTES, and CHANGES FROM PREVIOUS VERSIONS.
- Level 3 - Test Previous Release Customization Extensions (MQL, Worksheets, Dashboard, Presentations, REST API)
· Once again, do not yet start any metadata harvesting (model import) or configuration builds until you reach Level 4 below.
· Make sure you read the above release notes on: NEW WORKSHEET ATTRIBUTES, IMPROVED METADATA QUERY LANGUAGE (MQL), IMPROVED REST API, and CHANGES FROM PREVIOUS VERSIONS.
- Level 4 - Test Metadata Harvesting (Model Import and Configuration Build)
· If you used remote harvesting agents (servers), you must install new ones (based on this version) in a new directory. You may also want to copy the data directory from the old install in order to reuse the metadata cache for incremental harvesting. If you need both the previous and current versions of the software to temporarily coexist on the same machine for testing (until moving to production), then you must configure the new agents (servers) to run on separate ports (using the setup utility), and update them accordingly on the main server (using MANAGE > Servers).
· Do not yet create any new models to import with new parameters; instead:
⋅ It is recommended (but not mandatory) to perform a manual full import (no incremental harvesting), as this new version of the software may bring more detailed metadata. In any case, only import a few selected models to start with (and one at a time). We assume here (per the above PRE UPGRADE REQUIREMENTS) that such models imported just fine (with the exact same parameters) in the previous version of the software.
⋅ A day later, you can perform more manual full imports (one at a time), this time checking that incremental harvesting works.
⋅ Another day later, you can finally re-enable the scheduled automatic import (metadata harvesting) of that model. Then repeat the above steps for the other models.
- Level 5 - Test New Release Features from above release notes
- Level 6 - Patch upgrades from earlier builds of this version
· "MANAGE > Secret Vaults" capability has been enhanced and replaced by "MANAGE > Cloud Identity". The migration should be seamless, as the upgrade patch automatically migrates any existing configuration settings for Amazon AWS, Google Cloud, or Microsoft Azure from entries in MANAGE > Secret Vault to cloud identities in MANAGE > Cloud Identity. The ability to use a Cloud Secret Vault to externally store the bridge password parameter is preserved through the migration. Any use of the above secret vaults as a URL-based password parameter of a model import is automatically detected upon the first (manual or scheduled) import, which automatically populates the cloud identity of that model import.
With this improvement, there is now support for more than one cloud identity per cloud technology. In addition, to support the more robust cloud identity features, select import bridges now support more automatic cloud identity-based authentication. In this way, the password parameter is no longer mandatory for those bridges, and authentication may be based on the cloud identity.