Tuesday, 31 October 2023

Defining a Data Mesh

Zhamak Dehghani coined the concept of a data mesh in 2019. The data mesh is characterised by four important aspects:

  • Data is organised by business domain;
  • Data is packaged as a product, ready for consumption;
  • Governance is federated;
  • A data mesh enables self-service data platforms.

Below is an example of a data mesh architecture. The HQ of a multinational food marketer is responsible for the global governance of customers (i.e. retailers and buying organisations), assets (but limited to the global manufacturing sites), products (i.e. the composition of global brands) and competences that are supposed to be present in all subsidiaries. 

The metamodels are governed at the HQ, and data for the EMEA Branch are packaged with all the metadata needed for EMEA Branch consumption. These data products are imported into the EMEA Data Mesh, where they are merged with EMEA-level data on products (i.e. localised and local brands), local competences, local customer knowledge and local assets like vehicles, offices…

Example of a data mesh architecture, repackaging data from the HQ Domains into an EMEA branch package
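The packaging and merging described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `DataProduct`, the function `merge_products`, the field names and the sample records are all hypothetical and do not come from any particular data mesh product.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A domain-owned data set packaged with the metadata consumers need."""
    domain: str                # owning business domain, e.g. "HQ" (illustrative label)
    name: str                  # business concept, e.g. "products"
    records: list[dict]
    metadata: dict = field(default_factory=dict)  # definitions, owner, quality notes


def merge_products(hq: DataProduct, local: DataProduct, target_domain: str) -> DataProduct:
    """Combine an imported HQ product with branch-level data on the same concept.

    Every record is tagged with its source domain so lineage stays visible.
    """
    if hq.name != local.name:
        raise ValueError("can only merge data products describing the same concept")
    records = (
        [{**r, "source": hq.domain} for r in hq.records]
        + [{**r, "source": local.domain} for r in local.records]
    )
    # Local metadata overrides HQ metadata on key clashes (an assumption of this sketch).
    metadata = {**hq.metadata, **local.metadata, "merged_from": [hq.domain, local.domain]}
    return DataProduct(target_domain, hq.name, records, metadata)


# Hypothetical sample data: one global brand from HQ, one local brand from EMEA.
hq_products = DataProduct("HQ", "products", [{"brand": "GlobalBrand"}], {"owner": "HQ"})
emea_products = DataProduct("EMEA", "products", [{"brand": "LocalBrand"}], {"owner": "EMEA"})
merged = merge_products(hq_products, emea_products, "EMEA")
```

The point of the sketch is that the HQ package travels with its metadata, and that the merged EMEA product still records where each row came from.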

The data producer’s domain knowledge and place in the organisation enable the domain experts to set data governance policies focused on business definitions, documentation, data quality, and data access, i.e. information security and privacy. This “data packaging” enables self-service use across an organisation.

This federated approach allows for more flexibility than central, monolithic systems. But this does not mean traditional storage systems, like data lakes or data warehouses, cannot be used in a mesh. It just means that their use has shifted from a single, centralised data platform to multiple decentralised data repositories, connected via a conceptual layer and preferably governed via a powerful data catalogue.
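As a rough sketch of that idea, the catalogue can be thought of as an index that knows where each domain keeps its data products without holding the data itself. The names `CatalogueEntry` and `DataCatalogue`, the location URIs and the owner labels below are hypothetical and chosen for illustration only; a real catalogue tool will look different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogueEntry:
    domain: str     # owning domain
    product: str    # data product name
    location: str   # where the data physically lives (made-up URI)
    owner: str      # accountable contact


class DataCatalogue:
    """Central index only; the data stays in each domain's own repository."""

    def __init__(self) -> None:
        self._entries: dict[tuple[str, str], CatalogueEntry] = {}

    def register(self, entry: CatalogueEntry) -> None:
        key = (entry.domain, entry.product)
        if key in self._entries:
            raise ValueError(f"{key} is already registered")
        self._entries[key] = entry

    def find(self, product: str) -> list[CatalogueEntry]:
        """Return every domain's version of a product, sorted by domain."""
        return sorted(
            (e for e in self._entries.values() if e.product == product),
            key=lambda e: e.domain,
        )


catalogue = DataCatalogue()
catalogue.register(CatalogueEntry("HQ", "customers", "warehouse://hq/customers", "hq-data-office"))
catalogue.register(CatalogueEntry("EMEA", "customers", "lake://emea/customers", "emea-data-office"))
customer_sources = catalogue.find("customers")
```

Here a warehouse at HQ and a lake in EMEA coexist; a consumer asks the catalogue for "customers" and gets pointers to both repositories.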

The data mesh concept is easy to compare to microservices, which helps business audiences understand its use. This distributed architecture is particularly helpful in scaling data needs across complex organisations like multinationals, government agencies and conglomerates. It is by no means a useful solution for SMEs, or even for larger companies that sell a limited range of products to a limited type of customers.

In the next blog post, "Start with defining coherent business concepts", we will illustrate a data governance process typical for a data mesh architecture.

Tuesday, 24 October 2023

Why Data Governance is here to stay

Stronger proof than a fairly stable Google Trend Index that data governance issues won’t go away is the fact that “Johnny-come-lately-but-always-catches-up-in-the-end” Microsoft is seriously investing in its data governance software. After leaving the playing field to innovators like Ataccama, Alation, Alex Solutions and Collibra, Microsoft is ramping up the functionality of its data catalogue product, Purview.


Google Trend Index on "Data Governance"

The reason for this is twofold: the emergence of multicloud architectures and the advent of the data mesh architecture, which drives new data ecosystems for complex data landscapes.

Without firm data governance processes and software supporting these processes, the return on information would produce negative figures.

In the next blog post, "Defining a Data Mesh", I will define what a data mesh is, and in the articles that follow I will suggest a few measures needed to avoid data swamps. Stay tuned!