Data ownership in data mesh

Every month I have the honor of moderating a round table together with Karin Hakansson, Amy Raygada and Andrew Sharp. These round tables are hosted by the Data Mesh Learning community, and each one covers a topic in data governance on data mesh.

The subject for February 2024 was data ownership. The open discussion started from the following questions:

  • How do you assign data ownership and convince owners to truly take ownership?
  • What are the responsibilities of a data owner and how do you enforce them?
  • How do you view data ownership in the operational data plane versus the analytical data plane?

Analytical data ownership: the hard part

It seems that many companies, at least those looking at data mesh, have already completed traditional data governance projects. Most companies believe that you can quite easily assign ownership of source aligned data products to the owner of the operational source data.

I would argue that most companies do not yet have such operational data ownership or data stewardship in place, but that does not make the link between operational and analytical ownership any less relevant. If you have no operational data ownership, use the data mesh data ownership program as a trigger to appoint such data stewards.

At the other end of the spectrum, consumer oriented data products should be owned by the consumer. They are created for the sole purpose of supporting a use case relevant to that consumer. Most likely, data products from different domains, owned by different people, will be used and joined. Because there are multiple sources, it no longer makes sense to assign ownership based on the source, so we all agreed that the main consumer should take ownership.

Different layers of data products

But not everything qualifies as a source or consumer oriented data product. A number of data products sit somewhere in between. Take the most horrendous example: the customer 360 view. Such a view will probably contain data from nearly all your domains and will be used in multiple domains. This is where you should be pragmatic. The following guidelines might help you:

  • Does a data product only add a few fields from one or more other domains to an existing data product? Consider keeping the ownership with the person owning the first data product.
  • Does a data product combine data from multiple domains, for multiple purposes? Consider creating a new fake domain. Yes, a customer 360 is not a domain. But creating a new team to own it won’t kill you.
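
These guidelines, together with the rules for source aligned and consumer oriented data products, can be sketched as a small decision helper. It is an illustration only: the `DataProduct` fields, the `suggest_owner` function and the order of the checks are assumptions made for this example, not something the round table defined.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataProduct:
    """Hypothetical description of a data product (illustrative fields only)."""
    name: str
    source_domains: List[str]      # domains the data originates from
    consumer_domains: List[str]    # domains that use the product
    # Owner of the existing product this one extends with a few fields, if any.
    extends_product_owned_by: Optional[str] = None


def suggest_owner(product: DataProduct) -> str:
    """Suggest an owner; a starting point for discussion, not a verdict."""
    # Only a few fields added to an existing product: keep its owner.
    if product.extends_product_owned_by is not None:
        return product.extends_product_owned_by
    # Source aligned: a single source domain, so the operational owner owns it.
    if len(product.source_domains) == 1:
        return product.source_domains[0]
    # Consumer oriented: several sources but one consumer, who takes ownership.
    if len(product.consumer_domains) == 1:
        return product.consumer_domains[0]
    # Multiple sources and multiple purposes: give it its own team.
    return f"dedicated team for {product.name}"


customer_360 = DataProduct(
    name="customer_360",
    source_domains=["sales", "support", "billing"],
    consumer_domains=["marketing", "finance"],
)
print(suggest_owner(customer_360))  # dedicated team for customer_360
```

In practice the outcome of such a helper is an input to a conversation with the people involved, since ownership only works when the suggested owner accepts it.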

Owners only hear half your explanation

During my own data mesh journey, we went to domain teams with the explanation: “As of now, we will place people with data capabilities in your team. This will allow you to create your own data products and speed up your own data roadmap. You do, however, need to adhere to some rules, which are there to prevent duplication of data and logic and which should increase the quality of data.” Fair to say: everything after “however” was promptly forgotten, yet the prospect of completing their own data roadmap made their eyes sparkle.

Only half the story is heard — Photo by Julien Maculan on Unsplash

So how do you convince data owners to take on their responsibilities, and what are those? Cataloging your data, updating metadata, monitoring and improving data quality, managing access, … Those are all responsibilities of the data owner. Note that the data owner is not necessarily a technical person. So if you want to empower him or her to take on these responsibilities, the tasks themselves have to be easy. Your data owner is a key user persona of your data platform, which you should keep in mind when designing your user experience.

Making it easy is not enough: owning data is not their only task. Where possible, responsibilities should be automated. For example, certain metadata can be maintained in code and published to your data catalog from there.

But still: easy and automated is not enough. If people don’t understand the value of what they are doing, at a certain moment they will stop doing it, or at least experience it as an immense burden. You might succeed by policing them, but it won’t make them happy. Explaining the value of data ownership will always involve the following words: improved data quality, trustworthy data, guaranteed data availability.
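
As a minimal sketch of metadata maintained in code, the owner could keep a small record next to the data product and let a pipeline turn it into a catalog entry. The `ProductMetadata` fields and the `to_catalog_entry` function are hypothetical names chosen for this example, and the JSON output stands in for whatever format your data catalog actually ingests.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ProductMetadata:
    """Hypothetical metadata record kept in the same repository as the data product."""
    name: str
    owner: str
    description: str
    tags: List[str] = field(default_factory=list)


def to_catalog_entry(meta: ProductMetadata) -> str:
    """Serialize the metadata to JSON for a catalog ingestion job (assumed to exist)."""
    return json.dumps(asdict(meta), indent=2, sort_keys=True)


orders = ProductMetadata(
    name="orders",
    owner="sales-domain-team",
    description="One row per confirmed customer order.",
    tags=["source-aligned"],
)
print(to_catalog_entry(orders))
```

Because the record lives with the code, a change to the data product and a change to its description can be reviewed together, and the data owner does not have to update the catalog by hand.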

How do you enforce these responsibilities technically? We agreed to devote one of the next round tables to a technology session.

What about the operational plane?

We already agreed that source aligned data products and operational data should be owned by the same person. Yet data ownership in the operational plane and data ownership in the analytical plane are not identical. Analytical data reflects the operational plane. Data quality issues identified in the analytical plane, for example, should be resolved in the operational plane, even when they are only identified downstream by the owner of a dependent data product.

You can’t avoid a crash from happening when you are not involved — Photo by Abed Ismail on Unsplash

Note: if data quality loss was introduced by faulty business logic in data transformation, the responsibility of course still remains with the data transformation owner in the analytical plane.
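
The split of responsibility described above can be sketched as a simple routing rule. The `RootCause` values, the `QualityIssue` fields and the team names are assumptions made for this illustration. Determining the root cause is the hard part and is taken as given here.

```python
from dataclasses import dataclass
from enum import Enum


class RootCause(Enum):
    SOURCE_DATA = "source_data"                    # wrong or missing in the operational system
    TRANSFORMATION_LOGIC = "transformation_logic"  # introduced by faulty business logic


@dataclass
class QualityIssue:
    """Hypothetical record of a data quality issue found in the analytical plane."""
    description: str
    root_cause: RootCause
    operational_owner: str      # owner of the operational source data
    transformation_owner: str   # owner of the transformation in the analytical plane


def responsible_owner(issue: QualityIssue) -> str:
    """Return who should fix the issue, wherever it was detected."""
    if issue.root_cause is RootCause.TRANSFORMATION_LOGIC:
        return issue.transformation_owner
    # Everything else is fixed at the source, in the operational plane.
    return issue.operational_owner


issue = QualityIssue(
    description="Missing postal codes",
    root_cause=RootCause.SOURCE_DATA,
    operational_owner="crm-team",
    transformation_owner="analytics-team",
)
print(responsible_owner(issue))  # crm-team
```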

Data ownership, the outcome

To summarize: data ownership is a hard topic. It’s not only difficult to assign data ownership (hello aggregated data products in the middle), but also to convince owners to take on their responsibilities. You can make life easier for data owners through technology. But it all starts with charismatic leadership with a vision: only when you can explain what’s in it, both for your company and for the individual users, are you on a path to success.
