Why Does Data Mesh Exist?

Zhamak covers this much better than we can in some of her videos, especially her talks at Datanova (talk 1 | talk 2). A short version is below.

Zhamak Dehghani – the creator of the data mesh concept – worked for years as a consultant with many companies on the operational side of engineering. She started to see a pattern: companies were investing more and more money in their data lakes but failing to see returns. The term "data swamp" comes to mind.

So she dug deeper and found many issues that were common to almost all organizations, mainly these: data quality was poor; time to ad hoc analysis of data was abysmal, if it was possible at all (at large companies, it was usually 6+ months); and data engineers were badly overworked, owning all the data for the analytics plane in a large centralized data lake while lacking the domain knowledge to make that data truly valuable.

Zhamak knew that the issues with scale had been at least addressed, if not totally solved, on the operational side. So she decided it was well past time to apply the same approach from the operational side to the analytics plane.

The first step was to change the approach to data: treat it like a product and create data products. Treating data like an asset means hoarding and amassing it for no real reason. Treating it as a product means honing it and making it useful and easy to consume.

The second piece was to do away with the hyper-centralized data lake, where everything was just shoved in, there were no SLAs or quality guarantees, and data often lost its context. Moving away from that central lake is often referred to as the distributed architecture piece of data mesh.

So, that should help you understand the genesis of data mesh. Without that context, data mesh can seem like a solution in search of a problem, because the problem is as vast as “the way we’ve handled data for analysis internally at companies is completely bonkers and needs a wholesale change”. Yes, it’s that big 🙂