The data mesh paradigm shift is a journey. A journey that begins with a purpose rooted in delivering data-driven value at scale. Losing sight of that purpose is the most common antipattern. During my four years as Product Manager of the Data Platform at Adevinta Spain, I have been actively involved in accelerating the technology to prepare the company to achieve its goals.
For years, the data platform has been viewed as an expense rather than an investment. It's time for a shift in perspective: we must prioritize unlocking the data platform's ROI, and it is essential to sit at the same table as the business.
Speakers:
- Marta Diaz – Product Manager Data Platform, Adevinta
Read the Transcript
Speaker 1: 00:00 I want to welcome Marta Diaz, who will be talking about how to explain the ROI of implementing data mesh. Just to let you know, if you have questions you can put them in the chat; we'll relay them to Marta and she'll answer them at the end of the session. So yeah, take it away, Marta.
Speaker 2: 00:20 Thank you so much, Paul. Well, first of all, let me share my screen. Okay, it's going well. Hello everybody, thank you so much for being here. I really appreciate the invitation to take part in this Data Mesh Learning community, so thank you so much, Paul. As Paul said, I will be talking about the data mesh journey, our experience at Adevinta Spain, and, very importantly, about "trying", because I would like to discuss with all of you how to talk the language of business in terms of a data mesh journey: how can we talk about the return on investment of all of it? First of all, let me introduce myself. I am Marta Diaz, I am based in Barcelona, Spain, and my background is in big data and BI, as well as statistics, innovation, and design thinking. I am also certified in data management fundamentals and data governance.
01:28 During my experience inside the data world, I have had the privilege of participating, in a disruptive way, in a very important change management inside a company, where I took part in launching the starting point of a data-driven culture: being the first data product owner there, leading the first initiatives as quick wins to talk about being data-driven, making decisions with data, and making sure the business had the buy-in. So I am very used to talking about return on investment, because it is very important for starting the journey. I was also the data governance lead there, because every data product has to start with governance; it was something needed. And now I am working as a product manager inside the data platform, which is very different from that starting point, because Adevinta today is a mature, digital-first, data-driven company. Here I work from the platform side, bringing self-serve services to the users so they can make decisions with data and build these magic data products, and looking at how the data platform helps that happen.
03:02 I am also involved in different initiatives, like the Data Management Association (DAMA) Spanish chapter, and I lecture at different universities. Well, let's start our presentation. First, I will give a little introduction to Adevinta Spain, the company where I have had the experience of this data mesh journey. I will focus on why our data mesh journey started, because I believe that is the most important thing: thinking about why we start this journey gives us focus when calculating this return on investment, this value for the company. Then I will explain our data mesh journey from a data platform perspective, the data products platform journey. Then I will try to talk this language of business, talking about return on investment: the data platform not only as a cost but also as an investment.
04:07 And finally, the conclusion. Let's start with Adevinta. Adevinta is an online classifieds company with presence in around 10 countries, with well-known brands like Subito in Italy, mobile.de in Germany, leboncoin in France and, in Spain, Milanuncios and Fotocasa. Focusing on Spain, which is the experience I will share with you: in Spain we are the largest online classifieds group, and one out of two internet users in Spain connects to our platforms. We have a large amount of data, we are, as I said, a digital-first company, and we have different marketplaces and different domains we have to serve. This data-driven maturity lives inside every domain and every marketplace, and we as the data platform have to bring our services to them. So let's start with the why: why we started this data mesh journey.
05:11 To talk about the why, I will go through the different personas and what they want. The first is product: the different developers we have in each marketplace who provide data to the analytics side. What is happening is that they are adopting new technologies, moving from a monolithic architecture to microservices, and that means another way of consuming data: not from a relational database, but via events. So we needed a way to scale how we ingest this data, and to decide which services the data platform can offer to make ingestion self-serve, instead of every marketplace or domain finding its own way to ingest it. Then, from a legal perspective, from the CFO and compliance, they want to reduce data management risk at scale: they want to be compliant and to ensure data security, because under European regulation, the GDPR, fines can run to many millions of euros.
06:36 Another reason we started this data mesh journey, also in terms of scale, came from the CDO and data team perspective: they want to scale, to create new data products, to be autonomous in a cost-efficient way. And they have another handicap: ownership. Data is ingested from the operational world, and changes on that side break the data pipelines. So you have to solve, inside your data platform, how to put ownership on the producers, to be sure you have the correct data at every step and avoid that rework. Also, although we are a data-driven, mature company with data engineers inside the different domains, that is not enough for the needs, so the data engineers in each domain become a bottleneck; you have to extend this autonomy to other profiles, like analysts, or why not product owners. Another persona behind this data mesh journey is product and business, who want to make decisions in time to market.
07:59 They want the data now, they want to know what is happening, and they want to increase trust and understanding in data to make decisions. How do we scale that? And finally we have our CEO, who is very important, not only from a data perspective but in every investment: they want to talk about profit and loss, about return on investment, about cost and revenue. How can we show, for example, when we start a data mesh or data-driven journey, what the complete return on investment is? So that is why our data mesh journey started: we wanted to scale and solve all these concerns, and we were very focused on doing it from a data mesh perspective. So let's go one by one, starting, as I said, with the developers from the product perspective.
09:09 They moved from a monolithic architecture to microservices, and we had to bring this service to the different marketplaces and domains, because they choose the best solution for their systems, but it impacts the way data is consumed. So how can the data platform scale the way this data is ingested and consumed? From a security perspective, as I said, they are thinking about how to minimize data management risk, and here some questions appear. The security and legal department asks: how much personal data is inside our data platform, and who can access it? I could say: well, you have to ask the different owners, who are accountable for compliance. Or you can ask the data platform, which, with its control plane, can bring you all this information. Another question: do we have guarantees for personal data deletion rights? Am I sure this is happening, or do I have to ask team by team? How can I scale to get this answer, as accurate as possible, the moment I need it, and how can I manage and audit what is happening with this data?
10:39 The next profile is the CDO and the data teams. As I said, they want time to market for solving problems, and they want autonomy. I put here that they want governance, but in truth they are not always asking for governance; the one really worried about it is the legal department, which wants data management risk under control. The desired outcome is to create valuable data products and business solutions at scale. So the questions they ask are: how can we be autonomous from the data engineer bottleneck for new developments? As I said, there were data engineers inside every domain, but when a new development or feature appears, how can I be autonomous at scale and be sure it lands in time to market? How can data ingestion be accelerated?
11:44 How can I scale the way I am ingesting this data? And, I add here, how can we be compliant without friction? As most of you know, data profiles are focused on building solutions and answering business questions; they want to be compliant too, but they want it to be as easy and as scalable as possible. And one very important thing: how to be proactive, and what about producer ownership? What happened at the beginning was that we ingested data and only then analyzed whether it was fit for purpose. If there was a change at the source, a dashboard broke or an ML model stopped working, and we had to fix the change inside the data pipeline in our data platform. That doesn't scale, and it is not proactive; it is totally reactive, because we only realize when something is broken. From a business perspective, they want to reduce risk in decision making: "show me the data, now". Their desired outcome is increased trust and understanding: find the data they need, trust it, and make decisions with it.
13:20 And the questions they ask are: I would like to know if my new campaign is working or not; where is the data? Is this data discoverable? Can I trust it? Does this data fit my purpose? Am I doing my calculation right? Do I have one source of truth, or am I recalculating the same KPI yet again? And finally, the CEO. The CEO is not saying "show me the data"; the CEO is saying "show me the money": where is my money, the profit and loss? They want a return on investment on that initiative, and of course on data mesh, with all the change management it puts inside a company at this scale. So the question the CEO asks is: why do I have to invest in data infrastructure and data governance? And of course, as PM of the data platform, I want to answer that question. Another good question: what does it mean, in terms of money, to be data-driven, and what is the return on investment of our data investment?
14:32 All of these questions are the reason we started a data mesh journey, and now I will focus on that journey in terms of the data platform: the different changes that happened over the last years. First of all, what I call "before 2022", an era focused on infrastructure. We worked in a reactive way, focused on the different domains and teams, bringing infrastructure to them, where the building block was the domain or the team, with all the capabilities in terms of storage, scheduling, cluster provisioning, et cetera. Our focus was to have the cost of infrastructure split between the teams consuming it, and we had self-serve infrastructure. But what happened is that there were different experiences: every domain could have a different way of creating data products and pipelines.
15:48 And also, when the legal department asked where the personal data was, the accountability came only from the teams, from the domain perspective, because our data platform was focused on giving the infrastructure, giving the solution: it was up to you whether to use it, and how. You offer this self-serve, but with different ways of using it; that is a lot of autonomy, maybe, but it is not as scalable as it could be. And during this era, the users of our data platform were 100% data engineers, because it was very difficult to make it self-serve for other profiles; you needed technical knowledge to use our products, to use our self-serve platform. So during 2020 and 2021, our users were data engineers.
16:58 Then, after 2022, we focused not only on the infrastructure plane but also on the data product experience: the era we call "the data platform as a product". We were no longer focused on the infrastructure, we were focused on the experience, where the building block was not infrastructure for the different domain teams; the building block was the data product. How do we accelerate the creation of valuable data products, discoverable, trustworthy, with all the data-as-a-product principles? We were not talking about cost per team anymore; we were talking about cost and usage per data product, because, as our CEO says, they want a return on investment on what we are doing, a return on investment on data products. So you need to talk about the cost and also about the usage. And here the self-serve is not only the infrastructure tooling; it is the experience itself.
18:04 It is the way we embed computational governance and security by design, to be sure that the requirements from the legal perspective are met, and also the requirements for building trust on the business side. And here the accountability is not only on the teams: it is built into the experience, which puts in place the processes needed for this computational governance and gives compliance guarantees by design. With this we can answer the legal department through the control plane about what is happening inside, and be sure that, by design, we apply the requirements legal asks of us. As you can see, we moved from reactive to proactive, because we are focused on the value, on the return on investment, on the data product side. What also happened with this data product experience plane is that the personas, the users of our data platform, changed completely: our users were no longer only data engineers, but also data analysts and data scientists.
19:32 And in 2023 we also started with non-data profiles in this data product experience: product owners, developers who are accountable for this ownership, and business people who want to answer questions and explore the data. With this framework, this data product experience, they are sure of what they are consuming, it scales, and the data is close to them. So this is our full journey. At the beginning we were very focused on the data infrastructure plane, providing all this infrastructure. After 2022 we focused on the data product development experience plane, so you have autonomy, you are self-serve, but scaled in terms of security and compliance. And we also started the mesh supervision plane, to answer the questions our legal department and our CEO ask us.
20:48 How much PII is there? Are deletion rights guaranteed? How can we control what is happening inside our data platform, with the data product as the building block, and how can we bring information to our CEO: what is the cost, what is the value of each data product? Because the data platform exists because of these valuable data products. Until 2022, our conversations were about infrastructure; in our case, Airflow as a scheduler, S3, Redshift, Databricks: the different infrastructure we had for creating and analyzing these datasets. The different users took a source, used the self-serve infrastructure, and created datasets; the data engineers, as I said before. And what happened after 2022? We focused on the data product development experience, and that means we were not talking about infrastructure anymore: we were talking about this building block called the data product.
22:02 So we were talking about source-aligned data products, aggregated data products, and consumer-aligned data products, and how the data product development experience abstracts away the data infrastructure layer. We provide a framework for discovery, what we call the data product lab, where all the users (we are talking about product owners, analysts, and business people) can discover and explore the different data products, do some transformations, prototype exactly what they need, or run a quick query to answer a question. And inside the data product development experience we have what we call the data product builders: different frameworks that give us this abstraction from the infrastructure and put the governance in place, making sure all the metadata is collected, that we have observability, that the metadata is pushed to the data catalog so everything is discoverable and understandable, and applying governance by design, like the minimization of personal data and making sure ingestion has an analytical purpose from the start: shifting governance to the left. On these terms we have different solutions: data ingestion for creating source-aligned data products, aggregated data products, and a data consumption suite, more focused on the analytics side, where they can create data products to connect to a dashboard and build lovely visualizations.
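To make the idea of a data product builder more concrete, here is a minimal sketch of what such a framework could look like. Everything in it (the `DataProductSpec` structure, the `catalog` stand-in, the field names) is a hypothetical illustration of the pattern described above, not Adevinta's actual implementation:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DataProductSpec:
    name: str
    owner: str    # the accountable team or person
    kind: str     # "source-aligned" | "aggregated" | "consumer-aligned"
    columns: list[str] = field(default_factory=list)

catalog: dict[str, dict] = {}   # stand-in for a real data catalog API

def build_data_product(spec: DataProductSpec) -> dict:
    """Provision the product and push its metadata to the catalog, so anything
    created through the builder is discoverable and owned by construction."""
    metadata = {
        "owner": spec.owner,
        "kind": spec.kind,
        "columns": spec.columns,
        "registered_at": datetime.now(timezone.utc).isoformat(),
    }
    # A real builder would also provision storage/scheduling and wire up
    # observability here; this sketch only records the catalog entry.
    catalog[spec.name] = metadata
    return metadata

build_data_product(DataProductSpec(
    name="listings_events",
    owner="marketplace-backend-team",
    kind="source-aligned",
    columns=["listing_id", "price_eur", "published_at"],
))
print(catalog["listings_events"]["owner"])  # every product has an owner
```

The point of the pattern is that governance (ownership, catalog metadata, observability) is applied by the builder itself, so users get autonomy while every product created through it stays compliant and discoverable.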
23:57 So this is still our data products platform journey, because, as most of you know, this is a journey, a never-ending story. And how can we move from talking about data mesh and the data platform as a cost to talking about an investment? How do we move on from this picture of the different technologies, the different infrastructure, the cost of everything, the data platform as a service, where we can talk a lot about cost, to saying: you are an investment, not a cost? You are an investment if you can answer, or help these personas answer, the challenges and the questions they were raising before: the reasons we started this data mesh journey. So, from a product and tech perspective, they want to scale from monolith to microservices, and they want a self-serve platform,
25:04 one of the principles of data mesh. From a CFO, legal, and compliance perspective, they want computational governance with security by design, and this control plane to be sure what is happening inside your data platform, what is happening with all the data flows the users run inside. From a CDO, data manager, and data team perspective, you are an investment if you focus on data ownership: being sure every data product has an owner, pushing ownership upstream to the producers, organizing these domains inside your company, and giving them autonomy with a self-serve platform, with this data product development experience and this governance by design. From a product and business perspective, they want trust and discoverability: they want data as a product, with all the principles applied, so the data is trustworthy, discoverable, and so on. And from the CEO's perspective, he wants data monetization, and this control plane that brings him information about the data products we are building: the cost, the value, which ML models we create with them, which recommenders, which kinds of decisions we are making, which is the most critical data inside our company.
26:40 So, as you can see, I am talking about the four principles of data mesh. The way to be seen not as a cost but as an investment is to be very clear on these principles, make this data mesh shift, and talk about the platform not as a service but as a product: a product that brings capabilities for creating data products and for making sure we maximize the value of the data. Now I would like to share different learnings on how to go from a cost to an investment. First: make teams accountable; think about the cost and think about the value. The data platform must provide them observability and monitoring: they have to know who is using the data, the cost of every product, the infrastructure used to create it, the storage, and so on.
27:45 So you have to move from a data platform reported as a cost every month (storage, number of datasets, processing, plus an NPS satisfaction score) to another mindset: a data platform driven by observability, explainability, and productivity, which gives you the cost not per month or per team but per product; not storage per team, but the percentage of real data products inside your platform (you don't just have data, you have data products), plus cluster-provisioning efficiency and availability, because you are accountable for how efficient the different data pipelines are and for using your cluster provisioning well. And you have to put data product usage, all the observability, all the metadata about who is using the data and how, on the table. You have to move from cost alone to knowing what is inside your data platform. Here are some examples we have implemented at Adevinta Spain. We have one dashboard for cost: we started with the teams knowing exactly their cost in terms of infrastructure, setting thresholds, analyzing what was happening, and thinking about the cost of every job they ran. They started being accountable for their cost, because it is not free.
29:29 The platform has a cost, and part of this return on investment is simply cost efficiency. Another part that was also important for us as a data platform is to be an accelerator from the value perspective. So we have different dashboards. One of them shows the reusability of the data products and datasets, because it helps a lot to detect datasets that have no value for the company, and it gives the first step of value: the reusability; how many accesses you have, how many dashboards are connected to this data product, whether it is the input data of an ML model, or whatever. We started with the number of accesses, and we are working on other kinds of analysis too, for example who is using this data and what kind of queries the different users run, because it helps the data product owners a lot to plan the next release and to know how users are using the data product.
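As a rough illustration of the kind of aggregation behind such dashboards, here is a minimal sketch that joins cost per data product with access counts and distinct consumers, so the conversation can move from "cost per team" to "cost per product and its reuse". The records, product names, and figures are invented for the example:

```python
from collections import Counter

monthly_cost_eur = {          # e.g. storage + compute, by data product
    "listings_events": 420.0,
    "leads_aggregated": 180.0,
}
access_log = [                # e.g. one row per query or dashboard refresh
    {"product": "listings_events", "consumer": "pricing_dashboard"},
    {"product": "listings_events", "consumer": "recsys_training"},
    {"product": "leads_aggregated", "consumer": "pricing_dashboard"},
]

accesses = Counter(row["product"] for row in access_log)
consumers = {p: {r["consumer"] for r in access_log if r["product"] == p}
             for p in monthly_cost_eur}

for product, cost in monthly_cost_eur.items():
    n = accesses[product]
    print(f"{product}: {cost:.0f} EUR/month, {n} accesses, "
          f"{len(consumers[product])} distinct consumers, "
          f"{cost / max(n, 1):.1f} EUR per access")
```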
30:49 So that is the first learning: to be accountable for this return on investment, I need observability and I need data to make good decisions, and different ways of analyzing that information: number of users, trends over time. Another learning: producers must be owners. Faced with the bottleneck of solving source problems, the changes at the source, the data platform should help by providing producers and consumers with an automated solution for their data agreements. First, of course, producers and consumers have to reach an agreement, and then the data platform can provide the technical enforcement so the agreement holds in an automated way. We are talking here about data contracts: an agreement backed by a technical solution, which helps the business, the analysts, and the CDO with trust and understanding.
32:03 So we are working on a closed loop: the producer and the consumer make a data agreement, and the data platform turns that agreement into a document, a JSON file, where we can see the location, which Kafka topic the data comes from, what the columns are, the format of the different columns, and who the owner is if something happens. And if there is any change at the source, there is a new version of the data contract and an alert fires. So we are proactive: as we saw before, we don't want to be reactive. The moment something breaks, an alert appears and the producer has to fix it, because the producer has this ownership, this commitment, this data agreement with the consumer. And here the data platform is an enabler.
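For illustration, here is a minimal sketch of such a contract expressed as a JSON-style document, together with a naive schema check that raises an alert on drift. The field layout, names, and types are assumptions made for the example, not Adevinta's actual contract format:

```python
# A data contract mirroring what the talk describes: location (Kafka topic),
# columns and their formats, owner, and version.
contract = {
    "data_product": "listings_events",
    "version": "1.2.0",
    "owner": "marketplace-backend-team",   # who gets the alert
    "source": {"type": "kafka", "topic": "listings.v2"},
    "schema": {
        "listing_id": "string",
        "price_eur": "decimal(10,2)",
        "published_at": "timestamp",
    },
}

def check_against_contract(observed_schema: dict, contract: dict) -> list[str]:
    """Return human-readable violations; an empty list means the contract holds."""
    expected = contract["schema"]
    issues = [f"missing column '{c}'" for c in expected if c not in observed_schema]
    issues += [f"column '{c}': expected {t}, got {observed_schema[c]}"
               for c, t in expected.items()
               if c in observed_schema and observed_schema[c] != t]
    return issues

# E.g. the producer renamed a column: the check fires before any dashboard breaks.
observed = {"listing_id": "string", "price": "decimal(10,2)",
            "published_at": "timestamp"}
for issue in check_against_contract(observed, contract):
    print(f"ALERT to {contract['owner']}: {issue}")
```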
33:07 So change management also needs to happen. We have now evolved into this closed loop, where we know exactly which data is coming into the data platform, because we have this agreement, this data contract, and if something breaks, I get an alert, so I know exactly whether it still fits the purpose. Before, we ingested all the data, via API or via events, into the data platform, and once inside, the consumer and the data engineer analyzed whether it was fit for purpose. Then suddenly something broke, because the producer didn't know we were using this data, and the data engineer had to find out exactly what was happening: why is this data visualization broken? You had to investigate along the chain to see where and why it was happening, and then fix it. So the trust we gain with the closed-loop solution is clearly better than the other way.
34:15 Another learning, from the legal perspective: from legal requirements to legal partnership. What does that mean? It was very important for us to have the buy-in of legal. That means sharing this governance by design, to be sure all the legal requirements are applied, and also guaranteeing them the control plane. This helped a lot in our change management, because suddenly our legal department was talking about the lovely data contracts that let them know what is inside the data platform, whether there is any PII, and ensuring that we minimize the ingestion of PII that has no analytical purpose. How can we minimize it? We do this with the data contracts too: there is a declaration part for creating a source-aligned data product, where the producer and the consumer have to state which fields are PII, personal data information, and whether they are needed for an analytical purpose or not.
35:33 Because if it has no analytical purpose, it will not be ingested into the data platform: we are minimizing this part. And if it does have an analytical purpose, the legal department can say: okay, this is fine, given the reason you need it, but maybe think about a plan to minimize it further. At every step, with this framework, the legal department has this information and knows when new personal information enters our data platform. And of course we also record the ownership and the retention period, which is another requirement from the legal perspective.
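A minimal sketch of this "minimization by design" rule, assuming a hypothetical declaration format: PII fields are ingested only when an analytical purpose is stated, and each keeps a retention period for legal:

```python
# Declaration made by producer and consumer in the contract; the layout
# is illustrative, not Adevinta's actual schema.
contract_fields = [
    {"name": "listing_id",  "pii": False},
    {"name": "user_email",  "pii": True,  "analytical_purpose": None},
    {"name": "user_region", "pii": True,
     "analytical_purpose": "demand analysis by region",
     "retention_days": 365},
]

def fields_to_ingest(fields: list[dict]) -> list[dict]:
    """Drop PII that has no declared analytical purpose before ingestion."""
    return [f for f in fields
            if not f["pii"] or f.get("analytical_purpose")]

kept = fields_to_ingest(contract_fields)
print([f["name"] for f in kept])   # user_email never enters the platform
# -> ['listing_id', 'user_region']
```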
36:56 Another learning, needed by the business and also by the data teams: empower self-serve data product development at scale with effortless governance. Be sure you are creating a valuable data product, but with the legal requirements and with the governance that ensures, by design, that you have all the metadata, so you can discover this data, you can trust this data, and this data is secure. On the platform side, this is what I have been calling the data product development experience: the different frameworks that bring not only the self-serve infrastructure but also this experience, with the computational governance and the processes that make sure the result is a data product with data-as-a-product capabilities. And we have some use cases, always very focused on value for the business. In one of them, because the data engineer was a bottleneck, the data analysts and data scientists were getting the data from the data platform, writing a Python script on their laptops, building a model, and then delivering a CSV to the final user. This is not scalable, because it is manual, and it is not secure at all. But the problem behind it was real: I need to solve this.
38:03 I want to scale, but I need to build it in time, and with governance. So we created this level of abstraction: with access to the data, there is a repository and a template that make the user autonomous in creating this data product in an industrialized way, which also generates the observability. There is still a data owner who takes care of it, and a data engineer involved, but suddenly the data analysts and data scientists can scale and create new data products, industrializing this work without the data engineer bottleneck. Another use case, and this is probably a problem we still have today: the data engineer bottleneck also meant that the logic, the creation of new KPIs, happened inside our dashboarding solution, Tableau. That means that when a KPI had to change, the data analysts had to remember every dashboard where that calculation was done, and this produced different errors.
39:25 In terms of trustworthiness of the data, this was not scalable and it was a burden. So we brought a framework, in this case based on dbt, to create these reports, this business logic, inside the data warehouse, inside Redshift, so that this dataset connects to the dashboards and the transformation is not repeated in each one. We build it once, and we are sure we have only one source of truth.
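dbt models themselves are SQL in the warehouse, but the "one source of truth" idea can be sketched language-neutrally: the KPI is defined exactly once, and every consumer calls that single definition instead of re-implementing it in each dashboard. The KPI, names, and numbers below are illustrative:

```python
def conversion_rate(leads: int, visits: int) -> float:
    """Single, shared KPI definition: leads per visit."""
    return leads / visits if visits else 0.0

# Two "dashboards" consuming the same definition; change it once,
# and both stay consistent.
daily = {"leads": 120, "visits": 4_000}
weekly = {"leads": 910, "visits": 31_500}
print(f"daily conversion:  {conversion_rate(**daily):.2%}")
print(f"weekly conversion: {conversion_rate(**weekly):.2%}")
```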
40:45 Another example is the data contracts, with the proactive part we saw. Before, we ingested all the data and analyzed it once inside, so we were ingesting a lot of personal data that was not needed. Now, as you saw, the consumer and the producer define whether data has an analytical purpose, for minimization, and once everything is okay, once the data contract check passes, the data is ready for consumption as a source-aligned data product, with observability and so on. But if something breaks, an alert goes to the producer to fix the problem, and also to the consumer, who knows the producer has something to fix. So it is very proactive, as our users asked. Another very important learning: implement full observability to understand consumption and monetize your data marketplace effectively. And this one came from the CEO. As a CEO, I would ask: how much are we willing to pay to access amazing, trustworthy data products? This is still more of a dream, and the example I show here is illustrative, but it is true that we are working on it: a graph of all the data products, considering both cost and usage. And beyond what I have mentioned, the number of users, the reusability of a data product, which ML models connect to it, we also have to incorporate the contribution to the business. We know we have a recommender that brings 30% of our leads: which datasets are the input data for that recommender?
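A minimal, illustrative sketch of that idea: score each data product by joining its cost with the business value attributed to the use cases it feeds. The attribution model and all the numbers are assumptions for the example, not a real valuation method used at Adevinta:

```python
products = {
    "recsys_input_features": {
        "monthly_cost_eur": 600.0,
        # use cases this product feeds, with revenue attributed to each
        "use_cases": [{"name": "leads_recommender",
                       "attributed_revenue_eur": 50_000.0}],
    },
    "orphan_dataset": {"monthly_cost_eur": 250.0, "use_cases": []},
}

for name, p in products.items():
    value = sum(u["attributed_revenue_eur"] for u in p["use_cases"])
    cost = p["monthly_cost_eur"]
    if value:
        roi = (value - cost) / cost
        print(f"{name}: value {value:,.0f} EUR vs cost {cost:,.0f} EUR, "
              f"ROI {roi:+.0%}")
    else:
        print(f"{name}: no attributed business value, candidate for review")
```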
42:12 The creation of the data product also has value in itself: thanks to this reusability, the data scientist is not preparing the data, just consuming it, and that alone is return on investment for the company. Or take another use case, like increasing a conversion rate, or an AI system that resolves 80% of user requests. So we have to know which data products are contributing to these use cases: the more use cases you contribute to, the more valuable you are for the company. With this marketplace of data products, the question becomes: how much would you pay for this data product, for this input data, to make decisions with this dashboard, with this ML model, or whatever? And finally, the conclusion of trying to talk the language of business: to calculate the data mesh return on investment, you have to start by thinking about why you started this data mesh journey, about the different personas and the different things you want to scale and solve, and then about this control plane, this data product development experience, these data products, and your data products platform.
43:40 You have to understand the platform also as a provider of this observability and monitoring, so you can give a number, so you can give data-driven answers to the different questions of the different personas, and be sure that you are not only a cost: you are also an investment. And that is my presentation. Thank you very much, all of you. I hope you enjoyed it, and I would love to exchange points of view and challenges in the dialogue. We have been trying to calculate this return on investment and to bring data closer to the business for nearly ten years, and of course we are still working on it.
Speaker 1: 44:30 Yeah, well, that seems to be a common trend. Thank you so much, Marta. We do have one question from Stefan, who asks: how do you ensure that the data product platform is suitable for all your marketplaces and regions? And, linked to this, do you see a trade-off between flexibility and standardization?
Speaker 2: 44:55 Well, the way to be sure is to show, with this data product development experience, what value you are bringing to them, because of course you have to work on the change management. Before 2022, as I said, there were different ways of creating data products, and we started to create a unified way, an experience, to create data products. It is a journey, of course. You have to be very focused on the most valuable data you have inside and on having this observability and trustworthiness, so that the different marketplaces say: okay, I want this, so that governance is shifted to the left, and so that the producers are owners. You have to demonstrate with data what your value is for them, and get the buy-in. And of course it helps a lot to have the legal department with you, as I said, because of their question about how much personal data we have inside the data platform.
46:08 I can say: for the ones using my data product development experience, I know exactly; for the ones not using it, I don't know, you will have to ask them. So the legal department starts saying: hey, I want answers to these questions, quickly, and if you can't give them, there is a framework that can help you. So it is not about trying to push, but about showing the value you bring with the data product development experience, because it is not only a change in the way you work: it is needed for this observability, this data management, this minimization of personal data. It is a journey, of course. And at the beginning you have to start with the quick wins, with the marketplaces that want to go deeper, that want to trust their data. For that reason we started with something like the analysts who were creating logic inside Tableau.
47:16 Probably this is not the most valuable data, but for them it was very valuable, and they were very grateful. One thing we achieved is that they now come to our demos to present our solutions, because they see the value. From a data engineer perspective it is probably not so easy, because they already have their way of working, and there you work more through the legal department, where you have the PII. But with analysts who want this autonomy at scale, these use cases help a lot to show the value, because suddenly we accelerated business problems that were not being solved at scale. And sorry, I think there was another question.
Speaker 1: 48:09 Oh sure. Well I wanted to follow up on that. In terms of getting buy-in, is it more challenging to get buy-in at the C level or at the data engineering level or both?
Speaker 2: 48:26 As always, you have to bring data. The most important thing is not to speak with feelings: "I feel that the data platform is not governed", "I feel that it is very difficult to create a data product". You have to bring data. For example, in our case, looking at the usage of the different datasets, we discovered datasets in the data platform with no owner, or with no users in the last year. So suddenly you can say: if we continue like this, we will keep creating garbage inside our data platform, and that forces a change. Bring data about the platform itself and talk with data, not feelings. What can we solve? Well, the 30% of datasets we are creating that nobody is using, or the cases where we create one dataset per problem instead of one reusable data product. So observability here is very important. I would say: try to have as much observability as possible about what is happening inside, get the buy-in of the legal department, and why not the CEO's: if the data products go through this data product development experience, I can bring you their cost, and I can bring you the return on investment of this part.
Speaker 1: 50:15 Thank you. Well, and there’s another follow-up question from Stefan. So how exactly is the ROI value calculated? Is the value calculated as a euro value? If so, how do you manage to measure indirect effects?
Speaker 2: 50:28 Well, as with every calculation, we started with an MVP: what we can achieve today, the quick win for calculating return on investment. In our case, the easiest was to know how many accesses each data product has. We can start with that as an MVP for this return on investment, and then incorporate, as I said, which use cases we are creating with it. Eventually we would like to talk in euros, as you say, but we have to start with something. So don't focus only on the euros: focus on the reusability, because this brings value, and focus on knowing what the different users are doing with your platform. For us, we started with the reusability, with which products were using the data, and we will end with the euro, but today we don't have the euro.
51:40 What we are trying to achieve is to know these links and to ask the final users, for example the data scientists behind the recommender, what revenue it brings them, and calculate from there. Eventually I would like to put a value on all our data products, so a data scientist can say: I want to buy this data product, in this data marketplace. But today I don't have the euro, Stefan. The closest we have is the number of users, the different dashboards, where the data is used: something that gives us the value in other terms. So I would say: just start, and think as much as possible, in the next iterations, about getting to that euro figure for the CEO.
Speaker 1: 52:49 Well, thank you for answering that one. I have another question from at dta, sorry, hopefully I pronounced that correctly: can you recall the challenges you faced when the paradigm shift towards data mesh happened, and how you resolved them?
Speaker 2: 53:05 Well, as a data platform, the infrastructure and the technology probably have to be ahead of what is coming, because if you start with the domains but your platform is not ready for it, you have a problem. So the most challenging part was how to accelerate the data platform capabilities while balancing the change management, since defining the different domains does not depend on the technology side; that takes place in another way. And also how to evangelize on governance. For that reason we strongly believe that the best way to meet the GDPR and the legal requirements is to have the buy-in of the legal department. So the main challenge was to accelerate the technology to be prepared for what is coming, to balance the changes inside, and to do a lot of change management, trying to get as many partners, as many evangelists, as possible.
54:21 So once a user had created a data product, we would say: okay, you have to come with me and explain it. And once the legal department had the buy-in: please, you explain it. I don't want to be the one explaining the lovely things a data contract can do for you; I want you to evangelize. It is not a job for the platform alone: try to get as many evangelists as possible, because otherwise you will be fighting with everyone inside the company and the change management never happens. You have to start with these early adopters to win more and more people over, and stay very focused, of course, on valuable data products: which data is critical, what the main problems are, which problem matters most to the CEO, which data is most relevant, and, if there is any problem with data quality, how we can solve it with this framework, with this experience.
Speaker 1: 55:27 Yeah,
Speaker 2: 55:29 And well, the challenges appear every day, I have to say; it is a never-ending story. But stay very focused on the observability, I would say, get as many partners as possible, have this buy-in, don't talk alone, and try to convince with data.
Speaker 1: 55:47 Yeah, that's a really interesting perspective. You mentioned a couple of times getting buy-in from the legal department, which I hadn't heard before as the initial plan of attack, but it makes a lot of sense. Well, we have another question from Mark, sorry, I hope I put the accent correctly: "When I joined two and a half years ago as the data platform and governance director at Adevinta Spain" (so he must be a colleague of yours) "I told Marta that we would evolve to data mesh in three years with varied initiatives. I want to ask her what she thinks has accelerated this transformation we have achieved at Adevinta Spain."
Speaker 2: 56:31 Well, it is true that when Mark arrived at Adevinta Spain, he told me three years, and I remember telling him: no, not three years; I am an action person. Of course there is an MVP, we can do something, we can find the right use cases to get this buy-in. And we accomplished it in two years, I believe, having all this data product development experience. Of course it needed change management in some ways, but having a complete vision of what we wanted to achieve, and why, was there, accelerating the transformation. Thinking back on the different challenges we had: of course there were the different data engineers, the different people saying, why do we have to implement data contracts? It is easier for me to ingest the data directly.
57:32 And what was very interesting was to have the buy-in of the developers: suddenly, when they were talking together, the developer says, oh, you need this data? Maybe I can create an event closer to what you want. These are conversations that never happened before. So I believe that, yes, we have accelerated the transformation, but, as I said, the technology probably has to be there first, because if you don't have an implementation of data contracts, it is very difficult for this conversation, this agreement, to happen, or to be sure whether something is broken or not. So, answering Mark: I think we achieved it in less than three years, as you said. And thinking now of all the changes that happened, I believe the technology has evolved a lot, and we have learned a lot since the first conception, trying to make it happen with an MVP, iterating with the users: is this the most useful thing for you? It was hard work, but always together. And what I am happiest about is that we have created a team of data mesh lovers now, which is incredible, because they find the value. But you have to think about what you want to achieve with this data: what can a data mesh initiative solve for this part?
Speaker 1: 59:07 Yeah. Alright, well, we are at time. For the remaining questions, maybe I can add those to Slack and you can answer them there. But thank you so much, we really appreciate your time. That was really interesting, a very unique perspective that I hadn't heard before. So thank you again for joining us today. We'll be posting this on YouTube so you can watch it again if you want to. And yeah, we'll see you later.
Speaker 2: 59:42 Yeah, thank you so much.
Speaker 1: 59:43 Yeah, thanks for joining everybody. Alright, have a good day. Thank you. Bye bye.