Data Mesh

  • strives to close the gap between operational data and analytical data by promoting Data as a Product instead of carrying data from one to the other with ETL pipelines
  • Example
Traditional Analytical Data            | Data Mesh
Centralized Ownership & Governance     | Decentralized Ownership & Federated Governance
Monolithic                             | Distributed
Pipeline as first-class concern        | Domain as first-class concern
Data as a by-product                   | Data as a product
language: Ingesting                    | Serving
language: Extract, load, onboard       | Discover, consume, link
language: data flows through pipelines | publish data via ports
language: data lake/warehouse/platform | ecosystem of data as products

Data Mesh Pillars

  1. Domain-driven Data Ownership and Decomposition Architecture. Domains fall into three groups:
    • Domains aligned with the origin of data: Facts & reality of business, immutable timed events, historical snapshots, changes less frequently, permanently captured
    • Domains aligned with the consumption: Fit for consuming, aggregation/projection/transformation, changes often, can be recreated
    • Domains that straddle the above two
  2. Domain Data Product: data as a product with these characteristics: shareable, discoverable, understandable, addressable, trustworthy, interoperable, secure
    • it's an architectural quantum
    • provides historical and read-only access to data
    • consists of Input and Output Data Ports
  3. Self-serve Data Infra as a Platform
    • Autonomy: doesn't mean duplicating technical infrastructure or resources; instead, abstract technical complexity into self-serve data infrastructure using
      • blueprints, unified access patterns, discoverability, SLO and monitoring, pipeline orchestration, CI/CD, automating governance
  4. Federated Governance: build an ecosystem that governs all data products across all domains
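
The pillars above can be made a bit more concrete with a small sketch. It models a domain data product that takes records in through an input port, serves read-only history through an output port, and is checked against a mesh-wide governance rule. Everything named here is hypothetical and only for illustration: the `DataProduct` and `OutputPort` classes, the `mesh://` address scheme, and the two governance rules are assumptions, not part of any Data Mesh standard or library.

```python
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Callable, Iterable, Mapping

Record = Mapping[str, object]


@dataclass(frozen=True)
class OutputPort:
    """Read-only access point through which consumers get the product's data."""
    name: str
    read: Callable[[], Iterable[Record]]


@dataclass
class DataProduct:
    """A domain-owned data product with an input port and output ports."""
    domain: str
    name: str
    owner: str
    description: str
    _records: list = field(default_factory=list, init=False, repr=False)

    @property
    def address(self) -> str:
        # Hypothetical addressing scheme, so the product is addressable.
        return f"mesh://{self.domain}/{self.name}"

    def input_port(self, records: Iterable[Record]) -> None:
        """Accept upstream records; each is stored as a read-only mapping."""
        for record in records:
            self._records.append(MappingProxyType(dict(record)))

    def output_port(self, name: str = "all") -> OutputPort:
        """Serve the accumulated history as an immutable snapshot per read."""
        return OutputPort(name=name, read=lambda: tuple(self._records))


def governance_violations(product: DataProduct) -> list[str]:
    """Illustrative mesh-wide rules applied uniformly to every data product."""
    problems = []
    if not product.owner:
        problems.append("missing owner")
    if not product.description:
        problems.append("missing description")
    return problems


# Usage: the sales domain publishes an "orders" product.
orders = DataProduct(
    domain="sales",
    name="orders",
    owner="sales-team",
    description="Order events captured at checkout",
)
orders.input_port([{"order_id": 1, "amount": 25.0}])
rows = orders.output_port().read()
```

Consumers only ever see the tuple returned by `read()`, and each record is a `MappingProxyType`, so an attempt to modify served data raises `TypeError`. The governance function stands in for the federated layer: the rules are defined once for the whole mesh, while each domain keeps ownership of its own product.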