Data Mesh - intro for absolute beginners
Created for Udemy by George Smarts
What will you learn in this course?
Course Modules
Module 1 The Basics
Module 2 The 4 Key Principles
Module 3 The Data Mesh Architecture
Module 4 Data Mesh tools
Module 5 Implementation of Data Mesh
Module 6 Data Mesh Best Practices
Module 7 Case Studies
Module 8 Conclusion
Download the course resources
Main Resource - PDF Presentation file with all lessons
Other resources - additional files/resources provided in various lessons
Module 1
The Basics
Let's understand
what Data Mesh is!
What is Data Mesh?
Data Mesh is a new approach to data architecture
and governance that empowers cross-functional
teams to own and manage their own data domains
in a decentralized way, while also collaborating to
ensure data quality and consistency across the
organization.
Challenges of traditional data architecture
Challenge | What is the issue?
Centralized Control | Can lead to bottlenecks and slow down data access for teams across the organization. Delays can make it difficult to scale data usage throughout the organization.
Data Monoliths | A monolithic data architecture is difficult to change and adapt. It can be difficult to create new data products or services, and small changes in one part of the system may have a company-wide impact.
Ownership | Traditional data architecture can lack clear ownership and accountability for the data. This can lead to data quality and consistency issues. It also causes confusion about who is responsible for maintaining and updating the data.
Data Silos | Because of the speed of business and the lack of adaptability of the centralized data architecture, some teams and departments end up with their own data sources and systems. This makes it difficult to share information and to get a comprehensive view of the business.
Source of Image - https://www.interviewbit.com/blog/data-warehouse-architecture/
How does Data Mesh solve these challenges?
Challenge | Resolution
Centralized Control | Data ownership and management are distributed across individual domain teams. This reduces bottlenecks and allows faster data access and more scalable data usage.
Data Monoliths | Data Mesh uses modular, loosely coupled data systems that can be easily changed and adapted.
Ownership | Each domain team is responsible for the quality and consistency of the data that they own.
Data Silos | Data Mesh encourages data sharing and collaboration across teams. Teams can use their own data systems but need to adopt a mindset of sharing the information, for example through APIs.
Source of Image - https://www.eoda.de/wissen/blog/data-mesh/
Limitations of Data Mesh
Challenge | What is the issue?
Complexity | Data Mesh introduces additional complexity to an organization's data infrastructure.
Cultural Shift | For a successful Data Mesh, teams will need to make a significant cultural shift in how data is owned and shared.
Governance | Each domain team is responsible for the quality and consistency of the data that they own.
Tools | Data Mesh encourages data sharing and collaboration across teams. Teams can use their own data systems but need to adopt a mindset of sharing the information, for example through APIs.
Talent | Data Mesh requires technical expertise across all of the organization's domains.
Source of Image - https://www.eoda.de/wissen/blog/data-mesh/
MODULE 2 - THE 4 KEY PRINCIPLES
Let's understand what Data Mesh is really about!
Data Mesh
The 4 Key Principles
Domain Ownership
Data as a Product
Self-Serve Data Platform
Federated Computational Governance
Domain Ownership
Autonomy: Each domain has autonomy over its data and can make decisions on how to collect, store, process and share it (as long as this is aligned with the organization's overall data mesh principles and guidelines).
Accountability: Domain owners are responsible and accountable for the quality, security and consistency of their data products.
Scalability: Domain ownership allows teams to adapt easily to changing business needs.
Integration: Domain owners need to work together to ensure data products integrate well with each other and meet the overall organization strategy and needs.
Presentation by George Smarts for Udemy.com
Data as a Product
Data is a first-class citizen: Just like software products, data should be treated as a valuable asset that is developed, tested, and delivered to customers (i.e., other teams within the organization).
Ownership and responsibility: Each data product should have a dedicated team responsible for building, maintaining, and delivering that product to its customers.
Self-serve: Data products should be designed to be easily discoverable and consumable by their customers without requiring significant support or intervention from the team that built them.
Quality & Consistency: Data products should be built using standardized data models, definitions, and quality requirements, and tested rigorously to ensure their quality, reliability, and interoperability.
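To make the "data as a product" idea more concrete, here is a minimal sketch of how a team might describe one of its data products. The class name, the fields and the `is_discoverable` rule are all assumptions made for illustration; Data Mesh does not prescribe a specific format.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DataProduct:
    """Hypothetical descriptor of a data product (field names are assumptions)."""

    name: str
    domain: str
    owner_team: str          # the dedicated team responsible for the product
    description: str         # what consumers read to understand the product
    schema: dict[str, str]   # column name -> type name

    def is_discoverable(self) -> bool:
        # Assumed rule: a consumer needs a description and a documented schema
        # to use the product without asking the owning team for help.
        return bool(self.description.strip()) and bool(self.schema)


orders = DataProduct(
    name="orders_daily",
    domain="sales",
    owner_team="sales-data-team",
    description="One row per order, refreshed daily.",
    schema={"order_id": "string", "amount": "decimal", "ordered_at": "timestamp"},
)
print(orders.is_discoverable())
```

A descriptor like this gives each product a named owner and enough documentation for other teams to consume it on their own.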
Self-Serve Data Platform
Data Product Catalog: A self-serve data platform typically includes a data product catalog that lists all the available data products produced by different teams in the organization.
Data Access and Governance: Self-serve data platforms provide a way for teams to access and consume data products without relying on a central data team.
Collaboration: Self-serve data platforms can also facilitate collaboration and knowledge sharing across different teams and domains.
Data product lifecycle management: A self-serve data platform should provide tools and processes for managing the lifecycle of data products, including versioning, deprecation, and retirement.
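The lifecycle of a data product (versioning, deprecation, retirement) can be pictured as a small state machine. The sketch below is only an illustration: the stage names and the allowed transitions are assumptions, not something a particular platform defines.

```python
from enum import Enum


class Stage(Enum):
    DRAFT = "draft"
    ACTIVE = "active"
    DEPRECATED = "deprecated"
    RETIRED = "retired"


# Allowed transitions between stages (an assumption for this sketch).
ALLOWED = {
    Stage.DRAFT: {Stage.ACTIVE},
    Stage.ACTIVE: {Stage.DEPRECATED},
    Stage.DEPRECATED: {Stage.ACTIVE, Stage.RETIRED},
    Stage.RETIRED: set(),
}


class ProductVersion:
    """One version of a data product and its current lifecycle stage."""

    def __init__(self, name: str, version: str) -> None:
        self.name = name
        self.version = version
        self.stage = Stage.DRAFT

    def move_to(self, target: Stage) -> None:
        # Refuse any transition that is not listed in ALLOWED.
        if target not in ALLOWED[self.stage]:
            raise ValueError(
                f"cannot move {self.name} {self.version} "
                f"from {self.stage.value} to {target.value}"
            )
        self.stage = target


v1 = ProductVersion("orders_daily", "1.0.0")
v1.move_to(Stage.ACTIVE)
v1.move_to(Stage.DEPRECATED)
v1.move_to(Stage.RETIRED)
print(v1.stage.value)  # retired
```

Making the transitions explicit means consumers always know whether a product version is safe to build on.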
Federated Computational Governance
Decentralized Governance: Federated Computational Governance ensures that each team has the autonomy to make decisions about the data within their domain, while still adhering to the overall organizational governance policies.
Distributed Trust: Data Mesh relies on distributed trust to ensure the integrity and accuracy of data.
Collaborative Decision-Making: In a Data Mesh environment, decision-making is collaborative and consensus-driven.
Continuous Improvement: Federated Computational Governance promotes continuous improvement by providing a feedback loop for teams to learn from each other and adapt their governance policies and procedures over time.
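One way to picture the "computational" part of federated governance is policies written as small automated checks. The sketch below assumes a data product is described by a plain dictionary; the policy names and the rule that organization-wide policies cannot be overridden by a domain are assumptions for illustration.

```python
from __future__ import annotations

from typing import Callable

Product = dict
Policy = Callable[[Product], bool]

# Organization-wide policies agreed by all domains (hypothetical examples).
GLOBAL_POLICIES: dict[str, Policy] = {
    "has_owner": lambda p: bool(p.get("owner")),
    "pii_columns_are_masked": lambda p: all(
        col.get("masked", False)
        for col in p.get("columns", [])
        if col.get("pii", False)
    ),
}


def violations(product: Product, domain_policies: dict[str, Policy] | None = None) -> list[str]:
    """Return the names of all policies that the product fails."""
    # Global policies are merged last, so a domain cannot replace them by name.
    policies = {**(domain_policies or {}), **GLOBAL_POLICIES}
    return [name for name, check in policies.items() if not check(product)]


customers = {
    "name": "customers",
    "owner": "crm-team",
    "columns": [{"name": "email", "pii": True, "masked": False}],
}
# A domain adds its own, stricter rule on top of the global ones.
crm_policies = {"has_description": lambda p: bool(p.get("description"))}
print(violations(customers, crm_policies))
```

Each domain keeps the freedom to add rules, while the shared rules are checked the same way everywhere.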
MODULE 3 - The Data Mesh Architecture
How do we actually implement this? Let's find out!
Source of Image: datamesh-architecture.com
MODULE 4 - The Data Mesh tools and technologies
How do we actually implement this? Let's find out!
Data Catalog
A centralized repository that contains metadata about data assets across the organization and its
domains. It provides a way for domain teams to discover, understand, and use the data they
need for their business functions.
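As a rough illustration of what a data catalog does, the sketch below keeps metadata in memory and lets teams search it by keyword. The `DataCatalog` class and its methods are hypothetical; real catalog tools offer far more (lineage, access requests, schemas).

```python
from __future__ import annotations


class DataCatalog:
    """In-memory stand-in for a data catalog (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._entries: dict[str, dict] = {}

    def register(self, name: str, domain: str, owner: str, description: str) -> None:
        # Each data asset is registered once, with the metadata teams need.
        if name in self._entries:
            raise ValueError(f"data asset {name!r} is already registered")
        self._entries[name] = {"domain": domain, "owner": owner, "description": description}

    def search(self, keyword: str) -> list[str]:
        """Return names of assets whose name or description contains the keyword."""
        needle = keyword.lower()
        return sorted(
            name
            for name, meta in self._entries.items()
            if needle in name.lower() or needle in meta["description"].lower()
        )

    def describe(self, name: str) -> dict:
        # Return a copy so callers cannot change the catalog by accident.
        return dict(self._entries[name])


catalog = DataCatalog()
catalog.register("orders_daily", "sales", "sales-data-team", "One row per order, refreshed daily.")
catalog.register("customer_profiles", "crm", "crm-team", "Customer master data used for order analysis.")
print(catalog.search("order"))  # ['customer_profiles', 'orders_daily']
```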
Data Storage
Centralized repository for storing and managing large volumes of structured and
unstructured data from the different domains. Usually a data lake or a data warehouse
is used.
Data Pipelines
ETL (extract, transform and load) pipelines move data from source systems into the data lake or
data warehouse. They allow the domain teams to move their data from the source
systems/locations to the central repository.
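The three ETL steps can be sketched with Python's standard library alone. This is a toy example under stated assumptions: the CSV text stands in for a source system, an in-memory SQLite database stands in for the data warehouse, and the cleaning rule (skip rows without an amount) is made up for illustration.

```python
import csv
import io
import sqlite3

# Sample source data (invented for this sketch).
RAW_CSV = """order_id,amount,currency
A-1,10.50,usd
A-2,,usd
A-3,7.25,eur
"""


def extract(text):
    """Extract: read rows from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without an amount and normalize the currency code."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue
        cleaned.append((row["order_id"], float(row["amount"]), row["currency"].upper()))
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id TEXT PRIMARY KEY, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
print(conn.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone())  # (2, 17.75)
```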
Data Quality Management
Used to measure the quality of the domain teams' data and identify issues that need to be addressed.
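Two simple examples of what "measuring data quality" can mean are completeness (how many values are filled in) and uniqueness (how many values are distinct). The functions and sample rows below are a sketch written for this illustration, not the output of a specific tool.

```python
from __future__ import annotations


def completeness(rows: list[dict], column: str) -> float:
    """Share of rows where the column has a non-empty value (0.0 if there are no rows)."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(column) not in (None, ""))
    return filled / len(rows)


def uniqueness(rows: list[dict], column: str) -> float:
    """Share of non-empty values in the column that are distinct."""
    values = [row[column] for row in rows if row.get(column) not in (None, "")]
    if not values:
        return 0.0
    return len(set(values)) / len(values)


customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "b@example.com"},
    {"id": 4, "email": "c@example.com"},
]
print(completeness(customers, "email"))  # 0.75
print(uniqueness(customers, "id"))       # 0.75
```

A domain team could track numbers like these over time and treat a drop as an issue to investigate.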
Data Governance
Ensures that the data is managed in accordance with regulatory requirements and the pre-agreed
organizational policies. A data governance tool makes it easier for domain teams to
implement policies and standards for data management.
APIs and Service Mesh
Enable the exchange of data between the different domain teams.
Data visualization and reporting
These tools provide a way to present the domain data and make it an easy-to-consume product
for every domain in the organization.
Collaboration and knowledge sharing
Provide a way for domain teams to share knowledge, information and best practices with each other.
Choosing the right tools
Step by Step
Step 1: Define your requirements
Step 2: Research what is available
Step 3: Evaluate
Step 4: Test
Step 5: Other considerations
Step 6: Decide
*Check the additional external resources for more info and inspiration
Module 5 - Implementing Data Mesh
How do we actually implement this? Let's find out!
Implementing Data Mesh
Step by Step
Step 0.5: Get Stakeholder Buy-In
Step 1: Define your Domains
Step 2: Define the methodology and scope
Step 3: Assign Domain ownership
Step 4: Identify the data products and owners
Step 5: Establish the Data Governance model
Step 6: Establish the Data Mesh architecture and technologies
Implementing Data Mesh
Step by Step
Step 7: Create the data platform
Step 8: Data Mesh Governance and tracking success
Step 9: Further roll out and monitoring
Step 10: Continuous improvement
Module 6 - Data Mesh Best Practices
Here are some things to keep in mind
6 DATA MESH BEST PRACTICES
01 Domain-driven design
02 Invest in self-serve data architecture
03 Fully decentralize data governance
04 Data Product Thinking
05 Start small to test and expand
06 Get leadership buy-in
Module 7 - Case Studies
Here are some things to keep in mind
CASE STUDIES FOR FURTHER READING
01
02
03
04
THANK YOU