Data Mesh & Data Lake & Data Fabric

Reposted from: https://www.xenonstack.com/blog/data-mesh

How is Data Mesh different from Data Lake?

Given below are the differences between Data Mesh and Data Lake.

  • A data lake is a storage repository that holds a vast amount of raw data in its native format. Whereas a hierarchical data warehouse stores data in files or folders, a data lake uses a flat architecture.
  • The advantage of a data lake is that it is a centralized, single, schema-less data store holding raw (as-is) data as well as massaged data.
  • It provides a mechanism for fast ingestion of data with appropriate latency.
  • It helps map data across various sources and gives users visibility and security.
  • It offers a catalog to find and retrieve data.
  • It follows the costing model of a centralized service.
  • It provides the ability to manage security, permissions, and data masking.
  • The main difference between data mesh and data lake is ownership. A data mesh decentralizes ownership to the domain teams. In a data lake, ownership of the raw data is centralized, so domain teams usually consider their data a byproduct that they don't own.
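
The data lake properties listed above (flat storage, raw data kept as-is, a catalog for lookup, one central owner) can be illustrated with a small in-memory sketch. This is a toy model rather than a real data lake API: the `DataLake` class, its method names, the keys, and the `central-data-team` owner are all hypothetical names chosen for illustration.

```python
class DataLake:
    """Toy in-memory stand-in for a centralized data lake (hypothetical)."""

    def __init__(self, owner: str = "central-data-team") -> None:
        self.owner = owner      # a single team owns every object
        self._objects = {}      # flat namespace: key -> raw bytes
        self._catalog = {}      # key -> metadata used for lookup

    def ingest(self, key: str, raw: bytes, source: str) -> None:
        # Store the data as-is; no schema is enforced on write.
        self._objects[key] = raw
        self._catalog[key] = {"source": source, "owner": self.owner}

    def find(self, source: str) -> list:
        # Catalog lookup: which keys came from the given source?
        return sorted(k for k, meta in self._catalog.items()
                      if meta["source"] == source)

    def retrieve(self, key: str) -> bytes:
        return self._objects[key]


lake = DataLake()
lake.ingest("orders-2021-07.json", b'{"id": 1}', source="orders")
lake.ingest("clicks-2021-07.csv", b"user,page\n1,/home\n", source="web")

print(lake.find("orders"))                   # ['orders-2021-07.json']
print(lake.retrieve("orders-2021-07.json"))  # b'{"id": 1}'
```

Every object lands in the same flat store under the same owner, regardless of which domain produced it. That is the centralized ownership the last bullet contrasts with a data mesh.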

How is Data Mesh different from Data Fabric?

  • Data Fabric integrates data management across cloud and on-premises environments to accelerate digital transformation. It delivers consistent, integrated hybrid cloud data services that support data visibility and insights, data access and control, and data protection and security.
  • The difference between the two is that Data Fabric allows clear access to data and sharing of data across distributed computing systems through a single, secured, and controlled data management framework.
  • Data Mesh, by contrast, follows a metadata-driven approach and is a distributed data architecture supported by machine learning capabilities. It is a tailor-made distributed ecosystem with reusable data services, a centralized governance policy, and dynamic data pipelines.

What are the Benefits Of Data Mesh?

  1. Data Mesh provides agility. Each node works independently: it is containerized and can be deployed as soon as its changes are ready.
  2. Data Mesh provides scalability. New nodes can be built and deployed to the mesh whenever new data arises, and many portals and teams can access the same node, which lets the organization scale the mesh.
  3. Data Mesh can be used under various circumstances, such as connecting cloud applications to sensitive data that lives in a customer's on-premises or cloud environment, or creating virtual data catalogs from various data sources. It can also be used to create virtual data warehouses or data lakes for analytics and machine learning training without consolidating data into a single repository.
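
The ideas in this list (independent nodes, adding nodes as new data arises, and a virtual catalog that does not consolidate data) can be sketched as below. This is a minimal illustration under assumed names: `DataNode`, `DataMesh`, and the `orders` and `shipments` example nodes are hypothetical and do not come from any specific product.

```python
from typing import Callable, Dict, Iterable, List


class DataNode:
    """One independently owned and deployed data product (hypothetical)."""

    def __init__(self, name: str, domain: str,
                 read: Callable[[], Iterable[dict]]) -> None:
        self.name = name
        self.domain = domain    # the domain team that owns this data
        self._read = read       # data stays with the node; only a reader is exposed

    def rows(self) -> List[dict]:
        return list(self._read())


class DataMesh:
    """Registry of nodes; it holds references to nodes, never copies of their data."""

    def __init__(self) -> None:
        self._nodes: Dict[str, DataNode] = {}

    def register(self, node: DataNode) -> None:
        # Adding a node does not touch the nodes already registered.
        self._nodes[node.name] = node

    def catalog(self) -> Dict[str, str]:
        # Virtual catalog: node names and owning domains only.
        return {name: node.domain for name, node in sorted(self._nodes.items())}

    def query(self, name: str) -> List[dict]:
        # Fetch on demand from the owning node.
        return self._nodes[name].rows()


mesh = DataMesh()
mesh.register(DataNode("orders", "sales", lambda: [{"id": 1, "total": 30}]))
mesh.register(DataNode("shipments", "logistics",
                       lambda: [{"order_id": 1, "status": "sent"}]))

print(mesh.catalog())        # {'orders': 'sales', 'shipments': 'logistics'}
print(mesh.query("orders"))  # [{'id': 1, 'total': 30}]
```

The registry keeps only a name and an owner for each node and reads rows on request, so consumers can discover and query data across domains while each domain team keeps its own data.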

Conclusion

A data mesh allows the organization to escape the analytical and consumptive confines of monolithic data architectures and to connect siloed data, enabling machine learning and automated analytics at scale. It allows the company to be data-driven and to move away from data lakes and data warehouses, replacing them with the power of data access, control, and connectivity.

Original address: https://www.cnblogs.com/rongfengliang/p/14999991.html