
Data Without Borders

A summary of our Data Mesh Learning community roundtable discussion on August 31.

The Data Mesh Learning community hosted a roundtable discussion with data mesh practitioners to discuss how to manage data mesh architectures in international organizations where data is subject to local regulations.

Some of the questions posed during the discussion included: 

  • How are data products affected by local regulations?
  • Do countries with strict data privacy laws require local data products?
  • How do local regulations affect cross-country data aggregation?

Facilitators:

  • Jean-Georges Perrin, Senior Data, AI, and Software Consultant & President and Co-founder of the AIDA User Group 
  • Scott Hirleman, Founder and CEO of Data Mesh Understanding 

Participants included: 

  • Paolo Platter, CTO and Co-founder at Agile Lab
  • Samia Rahman, Director of Enterprise Data Strategy & Governance at Seagen
  • Andrey Goloborodko, Data Engineer at Wrike 
  • Matt Harbert, Data Architect at Direct Supply
  • Eric Broda, Senior technology executive, delivery specialist, and architect 

Watch the Replay

Read the Transcript

Download the PDF or scroll to the bottom of this post

Ways to Participate

Check out our Meetup page to catch an upcoming event. Let us know if you’re interested in sharing a case study or use case with the community.

Data Mesh Learning Community resources

You can follow the conversation on LinkedIn and Twitter.

Transcript

Jean-Georges Perrin (00:02):Everybody. Hello? Hello. Hey Matt. Nice to see you again.

Scott Hirleman (00:18):Hello everybody. So JGP, as people are filtering in, did you want to kind of cover a little bit about what you were hoping to talk about today in the general discussion?

Jean-Georges Perrin (00:33):So I am allowing myself a weekend of vacation in Maine. Okay. So that’s why I’ve got the sweater, despite very nice weather around. And we had to play a little bit with the sound because, unfortunately, once again, I was traveling. And what can we link data with? Traveling. So it was about traveling data and all the consequences we have in this world of having GDPR, having CCPA, and all these consequences of data sometimes being linked to the country where you actually are. So that was kind of a little bit the theme. I wanted to play the tourist and play with that part as well.

Scott Hirleman (01:24):Yeah. Well, and I think a lot of times people talk about this as data sovereignty, but I think there are multiple layers to this. If anybody’s familiar with German data rules, data about any German citizen literally cannot go outside of Germany. And so within data mesh, sometimes you end up having to do your domains specific to countries simply because of the legal structure. Or you have to just go, okay, we’re going to have, is that at the architecture level? Is that at the team level? Is that at the whatever? And how can we actually have queries that can leverage data without exfiltrating data, or how do we think about that? And this is something that comes up in a lot of my one-on-one discussions, but it’s not something that comes up in public discussions. I haven’t heard anybody have good answers. JGP, do you have good answers on how to think about this, how to do this?

Jean-Georges Perrin (02:29):I think I always have good answers. It might not be practical, but whatever. No, so what I think is that first, it’s something you will have at the data contract level. You specify the rule as a policy at the data contract level, so your data product is aware of that. But I was exposed to a case where there was some European data and we wanted the European data to come to the US. What I was thinking and suggesting in this situation, as we were architecting that, was to really think, okay, we have the data in Europe, we want the data in the US. Let’s have the same model, the same data product, in Europe and in the US, but we just have a different contract for what we would be consuming from the different product. Because you still want the data, you want the anonymized data, or you want some kind of statistical data, or you want an aggregation of the data. So that was what I had in mind.
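
A minimal sketch of what such a region-specific policy might look like, expressed here as a hypothetical data contract structure in Python. The field names, regions, and policy values are illustrative assumptions, not any particular data contract standard:

```python
# Hypothetical data contracts for the same data product published in two regions.
# Field names and policy values are illustrative assumptions, not a standard.
eu_contract = {
    "data_product": "customer_profile",
    "region": "EU",
    "storage_location": "eu-west-1",
    "residency_policy": {
        "pii_allowed": True,              # raw PII may be used inside the EU
        "cross_border_transfer": False,   # individual rows must not leave the region
    },
}

us_contract = {
    "data_product": "customer_profile",
    "region": "US",
    "source_region": "EU",
    "residency_policy": {
        "pii_allowed": False,                              # no raw PII exposed
        "allowed_outputs": ["aggregates", "anonymized"],   # statistics only
    },
}
```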

Scott Hirleman (03:53):Yeah. Well, and so when you’re thinking about this, again, do you think of this as we need to enable queries to pull the data, or do we need to enable the consumption of data in a way that is in accordance with the law? I mean, are you still moving data, or are you leaving it where it is, and how do you think about that kind of different concept? Because there’s so much nuance to this, because this is such a difficult thing and all the laws keep changing. Right. And Paolo, I’m sure we’re going to get some feedback from you, because I’m sure you have to deal with this with so many more European customers as well.

Jean-Georges Perrin (04:41):Hey guys.

Scott Hirleman (04:43):Hey Carlos.

Paolo Platter (04:46):Hi to everybody.

Scott Hirleman (04:48):So Pao, I would love to hear how you are specifically dealing with data sovereignty challenges or how do you think about setting up those queries? Because is it that you have a copy of the data product from one country to another? Is it that you have a copy of the model so people can understand the data and then they consume it where it lies and they’re not moving it around? Or how do you deal with those restrictions around data movement when you’re still trying to get global answers?

Paolo Platter (05:21):Is the question about data modeling? I mean, whether it’s worth it to model a global data product versus region-specific data products, or is it about security and how we deal with proper masking or other stuff?

Scott Hirleman (05:41):Well, and not moving the data. I think it’s just all of those things. But yeah, JGP probably understands the question more. There is so much chaos in this question.

Jean-Georges Perrin (05:54):You’re right. There are two things: the technical modeling of the data product itself, but also the other side of that, which is regulation. So how do you combine them? Because you still have to move the data, or copy the data, or make the data accessible to someone who has access to it. For example, as Scott was saying, the German citizen data stays in Germany, it doesn’t have to move, but maybe the same data product has a different port for a US global company that is accessing it. Or do you model it differently? I think that’s the question.

Paolo Platter (06:54):Yeah, there are so many options. I dealt with something similar in a banking environment, with the Austrian regulation. In Austria they are really, really strict about data protection, and also, in that case, with China, with a global enterprise, where China has a totally different regulation about data. In those cases we typically opted for creating country-specific data products. So it is easier to maintain ownership and apply the local regulation based also on the local knowledge, because it’s very risky to provide the regional data to a global team and instruct them to manage those data in a certain way. Anyway, you are opening yourself up to breaches and misinterpretation and other kinds of stuff. Local people, instead, typically have the expertise and the full understanding of the local regulation to manage it properly. And then, once they manage the data in the local region, you can try to find a different way to share those data with the other countries or with the global organization, or sometimes you have the holding company that needs to do some aggregated reporting across multiple regions.

(08:46):So then you can apply data protection there, applying masking, row filtering, column masking, and so on. Also, if the interface that you’re exposing has no capabilities for masking, like if you are offering an S3 plane or GCS or something like that, it is very, very hard to apply proper security rules. And in that case, for example, we allow for data duplication. So if you need to expose data for data scientists in a raw format, in that case you can create another output port that has, let’s say, a less restricted profile to access. And those data can be accessed by data scientists in the global company or in other countries. So I guess there are two options. If the interface is able to protect the data with row filtering and column masking, you can have one single output port and different profiles to access the data. If the data interface has no such capabilities, the only way is duplication: create multiple output ports. This is typically how we approach this stuff.
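
As a rough illustration of the row-filtering and column-masking Paolo describes, here is a minimal PySpark sketch. The table names, columns, consent flag, and hashing choice are assumptions made up for the example, not his implementation:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Assume a country-specific data product owns a raw, locally governed table.
raw = spark.read.table("customer_transactions_de")  # hypothetical table name

# A less-restricted output port for consumers outside the region:
# rows are filtered to what may leave, and direct identifiers are masked.
shareable = (
    raw
    .filter(F.col("consent_cross_border"))            # row filtering
    .withColumn("email", F.sha2(F.col("email"), 256))  # column masking
    .drop("national_id")                                # drop raw PII outright
)

shareable.write.mode("overwrite").saveAsTable("customer_transactions_de_shared")
```
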
Scott Hirleman (10:28):So I mean it’s fascinating, but do you think of the data consumption as being local? Because how do you prevent that exfiltration when you do a combination across multiple things? That’s what I’m still trying to figure out. It’s very confusing for me when somebody goes, okay, I’m going to pull from these four different data products. And is it that you say, no, you can’t actually do that, you can’t have downstream consumption? Or that you have what a lot of people are calling recipes, and so somebody would be able to do that, but it’s not a live data product, and then the compute actually goes to that country? It just becomes this really difficult question to answer. Is it something that is super, super specific to every company, and there isn’t something that’s happening everywhere? Yeah,

Paolo Platter (11:27):This really depends on the technical interface that you’re exposing. So if you’re exposing data through some virtualization layer and so on, basically the computation on those data is happening close to the data itself and you don’t have such a problem. If instead you’re exposing a technical interface that has no computation power, like GCS or S3, obviously you have such problems, because the client is in another region and the data is streamed in some way to the consumer. And in that case, I guess it’s impossible to do client-side masking or filtering without risking some breach. So that’s why I was saying it really depends on the technical design that you are creating to expose your output port. In some cases it’s possible to protect the data locally, in some other cases it is not possible. So the only way is to create a copy that is protected by default inside the same data product, so inside the same ownership perimeter, and then let the consumer access only the portion of those data. The risk of doing this is that you need to create several copies for different profiles, and it’s becoming a mess to manage

Scott Hirleman (12:55):And expensive. Yeah. So Samia, you had your hand up for a while. I would love to hear kind of your thoughts on this too.

Samia Rahman (13:03):Yeah, I was just going to share that with the latest and greatest technology that’s out there, I don’t see replication happening as much. Because if we take Databricks or lakehouse architecture, not picking on any vendor, or the data warehousing capabilities Snowflake offers, you always get compute with your storage. So any queries that I want to make in the US, for example, about aggregate spend reporting in the EU, I’m going to send the query to the compute layer that exists in the EU and I’ll get the results. And that to me is the consistent pattern I’ve seen in everything. Paolo, you mentioned access controls, masking, anonymization, et cetera; those are established in the local data products. So the EU will set all those security permissions, but when I’m in the US trying to unify the global aggregate spend report, I’m only getting the relevant insights or data.

(14:20):I’m not looking at the granular data at the aggregate level or at the global level. I’m requesting, hey, EU, can you give me your aggregate spend by region? So that responsibility lies with the EU data management team, like Paolo was mentioning earlier. The other thing I think is worth looking into, I was at the Bio-IT conference earlier this year, and there was a great presentation around federated AI learning. Healthcare has already proven out how you can have privacy by design across healthcare organizations around the world, and those organizations can share their localized learning models to create aggregate insights or aggregate trainings. So there is advancement in that perspective, where I think replication has just not been there. I haven’t seen replication as the norm in a long time. It’s really about that localized training or localized insight provided as an output so that your global entities can aggregate those things.
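
A toy sketch of the pattern Samia describes, where the US side only ever receives aggregated results from an EU-hosted endpoint. The URL, parameters, and response shape are hypothetical, purely to illustrate the query-to-the-compute idea:

```python
import requests

# Hypothetical EU-hosted query endpoint for the spend data product.
# The query executes on EU compute; only aggregated, non-PII rows come back.
EU_SPEND_ENDPOINT = "https://eu.example.internal/data-products/spend/aggregates"

response = requests.get(
    EU_SPEND_ENDPOINT,
    params={"group_by": "region", "metric": "total_spend", "year": 2023},
    timeout=30,
)
response.raise_for_status()

eu_aggregates = response.json()  # e.g. [{"region": "DE", "total_spend": 1200000.0}, ...]

# The global report combines these aggregates with other regions' results;
# granular EU records never leave the EU.
```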

Scott Hirleman (15:33):Are you seeing that, though? Like, Paolo was talking about an object store, right? I mean, maybe you throw Athena or something on top of that when you have that object store. When you’re talking about what data scientists want to do, like JGP, you were serving data scientists. So when you’re thinking about that, is it the same thing? Because what I was hearing from Paolo, and what I get, is sometimes this stuff is a messy query, and sometimes you have to go, I can’t necessarily have the compute be executed there if I don’t have compute there and if I don’t have that capability. So is what you’re saying that, no matter what, you should be building into the query layer that the execution is still going to happen locally, even if it has to spin up an EC2 instance or do Athena or something locally on AWS? That’s an easy way to do it, but then it does become that the platform team has to do it, but that makes it, I guess, a little bit easier on the data product teams.

Samia Rahman (16:39):Yeah, I believe, I’m not an expert on this, but I believe when you have microservices, right, when you do data collection in Germany, you’re going to have an endpoint just for Germany and that data will get localized. And now if you want to share any reads from there, that microservice is hosted in Germany, so anyone querying from the US will only get the filtered output. But the compute always lies in that local space. And I think that that is an established pattern. So it would be odd that there is no compute; you can’t just store data without compute, right? There has to be some kind of compute layer in those local regions

Scott Hirleman (17:22):Sometimes. But I’ve seen some companies do things in some crazy ways. So we’ve got three hands up, and by the way, anybody that wants to participate, feel free to do the little Zoom hand up or even just jump in when there’s a lull. But Andrey, you’ve had your hand up the longest, so

Andrey Goloborodko (17:39):Thank you. Thank you. So I’m asking the question because I don’t have very much experience with building this GDPR-compliant environment. Basically, what I’ve seen is that companies just put all their analytical data in the place with the stricter regulations. So you just pull all the data out of the US and put it in Europe, and that works for now. But if you do it correctly, just take a simple task. For example, you want to make a marketing email send, and you want to put the compute engines, basically some small microservices that actually send these letters, close to the data. So instead of one simple microservice, you should have a microservice in each country where your data is localized, so your data will be processed in the country where it’s stored. But from my perspective, it requires a really, really complex DevOps configuration. I don’t even know how you would do it right now in Kubernetes or some engine like that. So is this real? Has anyone really built that stuff? And how much effort did it take?

Scott Hirleman (19:14):Yeah, I think it becomes kind of what Paolo said, that it can also become a logistical nightmare to have a lot of this stuff. Then you do have all that stuff. So Paolo, you’ve had your hand up for a while, so I’d love to hear what you’re thinking there.

Paolo Platter (19:33):Yes, no, I just want to clarify one thing. So yeah, what Samia was saying is right if you’re able to bring all the users onto the same technology, onto the same platform, so for example Snowflake or Databricks, whatever. But things get more complicated if you have more than one platform in the company. So maybe you have the data warehouse or the lakehouse on Snowflake and you have data scientists working on Databricks who need to access the data in Snowflake. In that case, either you access it through a JDBC or ODBC connection, which has very low performance and is not okay, for example, for training machine learning models, or you need to adopt some other strategy. So that’s why I was saying it really depends on the way you standardize the output ports of your data product and the protocols that you put on top of them.

Scott Hirleman (20:55):Yeah, for sure. This is a challenge. I wish that there were magic wands and easy buttons for this, but Matt, you’ve had your hand up for a while.

Matt Harbert (21:07):Yeah, so I just had a clarifying question. I just wanted to try to get some context around it a little bit. My organization doesn’t deal tremendously with jurisdictions that are international, but I have worked in other organizations that do. And so as I think about data products, and data products localized to different parts of the world, is it really more a case of we’re doing that to make sure we’re keeping the data close to the jurisdictions where people are going to need to work with that data? Or is it more a case of we’re trying to comply with local regulation? Or is it both?

Scott Hirleman (21:46):From the conversations I’m having? I want to hear what everybody else has to say, but it’s a great question. Because from the conversations I’m having, it’s much more driven by legal than it is by business need. Because people are people, and sometimes, yes, the offerings are different in different countries and you want to look at them differently, but you also want to have global aggregates. And so it’s mostly driven by regulation from what I’ve seen. But I want to hear what other people have to say.

Speaker 7 (22:17):No, I think JGP has his hand up. You go first, or I can go,

Scott Hirleman (22:22):Well, you have your hand up first. Yeah,

Jean-Georges Perrin (22:23):So just

(22:26):A little bit towards what Matt asked as a question. I think it’s also, there’s definitely the regulation. So for example, as Scott was saying before, data about German citizens cannot leave Germany, or GDPR covering Europe, or CCPA in California. So that’s kind of the regulation you can be confronted with, but you also have some kind of business needs. So for example, I don’t remember who was saying it, I think it was Andrey, about sending an email. But anyway, if you do an email campaign, you’ve got local rules. So the technology could be centralized. Okay, let’s imagine a multinational company. So the technology to send the email itself would be, I don’t know, in India, let’s say. But the way to process the email, to write it, this is often something which is localized to the country. You don’t speak to the French people like you speak to the Dutch people. Okay, we like food, they have no taste, and Scott is experiencing that for us right now.

(23:58):But the thing is, you can communicate with Italians about food as well. Okay, I’m just teasing people there. But anyway, really the thing is, there is a business aspect of it, right? There’s the regulation, okay? You cannot bring my email to the US, but the targeting itself can be based on anonymous criteria, let’s say males between 20 and 29 or something like that. This is a rule that can be global and applied locally, and the messaging can be designed locally as well. So the data doesn’t really leave the country, but it’s being exploited by a global policy eventually. So this is a kind of very complex situation you can have. But now I’m also interested, okay, I’ve got a budget to reach out to, let’s say, 500,000 people. How many of those 20 to 29 year olds, or 20 to 34 year olds, am I going to target with my budget of $500,000?

(25:18):I need to have this information also to make my marketing decision, for example. So this is where, if your central marketing is, let’s say, in the US and you’re targeting Dutch and French people, well, you still have to bring this data there, but you can bring it in an anonymous way. So this is where we need to understand what data is accessible at the individual level, what they call PII, the personally identifiable information, which you don’t want to go out of the country, versus the statistical information, 5,000 24-year-olds, which you can bring, or the aggregate portion, collectively these 24-year-olds spent a million dollars in my stores. So that’s also why larger companies with headquarters around the world want to be able to bring in these analytics, and they need to because of their reporting, because they are usually publicly traded companies. So they need to be able to report this kind of information to their shareholders. So it’s not about completely creating a firewall to the data, but a firewall around the PII of the data. Okay. So I hope this clarifies a little bit
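
As a small illustration of the "firewall around the PII, not the data" idea, here is a pandas sketch of a local aggregation whose output contains only statistics. The file paths, column names, and suppression threshold are made up for the example:

```python
import pandas as pd

# Hypothetical local (in-country) customer table containing PII.
local = pd.read_parquet("customers_fr.parquet")  # illustrative path

# Aggregate locally: counts and spend by age band, no identifiers in the output.
local["age_band"] = pd.cut(
    local["age"], bins=[19, 24, 29, 34], labels=["20-24", "25-29", "30-34"]
)
export = (
    local.groupby("age_band", observed=True)
         .agg(customers=("customer_id", "nunique"), total_spend=("spend", "sum"))
         .reset_index()
)

# Optionally suppress small groups so individuals cannot be re-identified.
export = export[export["customers"] >= 100]

# Only this statistical table is shared with the global marketing team.
export.to_parquet("fr_spend_by_age_band.parquet")
```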

Speaker 7 (26:43):Question. Yeah, we only have two minutes, but JGP, so I think Scott, you were asking, and I think Matt was asking: it’s all driven by business needs, I would say, and then you wrap the compliance and governance policies around it. I think that’s how I see it. I mean, I have a hard time seeing UK personal data being brought into the US for marketing, so I think that may not be the use case. But at the same time, I think it’s mostly finance use cases, where you’re aggregating the finance information across your multiple countries and providing a single pane of glass to your leadership about the financial data, those types of use cases. I’ve seen it, but in those use cases, what we do is work with compliance and mask, or not even bring, personal data into the US region. So I think those are the things that we take care of.

Scott Hirleman (27:28):Yeah, I am seeing that a lot in the financial services space, because you just go, you don’t need this data, or there’s no value to this data, or we’re going to make this super, super locked down. So you can do all of this stuff, you can move this data, you can bring the compute to it, you could do whatever. But if it’s PII, then it’s something that’s a little bit more custom. But the number of times when people think they need PII and they don’t, it’s about 85 to 90% of the time, from talking with a lot of people in the financial services space. Because as soon as compliance and legal start pushing back, they’re like, why do you need this? Why do you actually need this? And they’re like, well, we thought it might be interesting. And it’s like, okay, just use a customer ID. It’s fine. You don’t need to be

Jean-Georges Perrin (28:20):Another sector where it’s really helpful to think about this is risk, because the criminals, the black hat people, they don’t really care about borders. So the same individual that can be located, let’s say, in Maine could actually be a bad operator acting as if they are in France or acting as if they are in the UK. So you want to be able to aggregate that data to say, well, this guy is a bad guy and is pretending to be in both these countries. So there’s a lot of use cases for bringing data together.

Scott Hirleman (29:11):I see you unmuted at this point. There are 8 million different questions that we could ask or answer or anything like that. So I want to just let it go where everybody wants to take it. Go ahead, Paolo, I saw you unmuted, so I assume you’ve got something you want to add there.

Paolo Platter (29:28):No, I was thinking of another use case, similar to the one just highlighted by Jean-Georges. The other use case is a holding company that owns maybe two different companies, and salespeople of one company can’t see the contacts of salespeople in the other company because, anyway, they are in competition. Like, I dunno, two car makers; I have a holding company owning two car makers. They don’t want to share numbers about sales. But anyway, the holding company needs to aggregate numbers to understand the trend of sales and so on. So there are plenty of use cases like this. Anyway, all these use cases and patterns, under the hood, come down to row filtering and column masking in terms of capability on the data.

Scott Hirleman (30:34):I’m still wondering, because this comes up a lot for me in my conversations, and I think it’s really important to this: how are you seeing people actually do cross data product queries, right? I mean, I think we’ve talked about people doing virtualization, but it’s not that people are actually computing queries across multiple data products. How is that actually executed? Is it that somebody’s having to write advanced queries? Is it something via the platform? I think that filters down into this, and JGP, I saw you just unmute as soon as I said that.

Jean-Georges Perrin (31:15):Isn’t that the whole idea of the mesh? Because the thing is, if you are just querying one data product, you’re not doing data mesh, right? You’re doing, I’m querying a data product. Okay,

Scott Hirleman (31:28):High quality data silos, high quality data silos, great, yay, we’ve got high quality data, but it’s data silos.

Jean-Georges Perrin (31:34):Yeah, exactly. Xavier,

Samia Rahman (31:39):Yeah, I was just going to say I’ve been dealing a lot with cross data product queries, or cross domain data products, because of master data management. You have this core customer data, but it has to be reusable across many domains, and then you stitch back all these data products from different lines of business to do that aggregate spend report, let’s say. To me, at least, it works simply and beautifully with the data mesh architecture, because your final fit-for-use data product, which is my aggregate spend data product, simply requests access to the upstream data products. It gets the view or the data that it seeks and runs the joins. Now, they can be on different tech platforms; like, two of my data products are in Databricks and one in Azure Synapse. I’m allowed to join them and get final results. So the localized queries run on those upstream data products when I fetch that data, but then the finalized join and the final aggregate view are on my aggregate spend data product. So to me, I find that question strange. I’ve been hearing it for over five years now, because when you actually implement it, it does work. So I struggle with where that question usually comes from, because we’ve done it with microservices as well. We do cross database joins to make aggregate fit-for-use applications happen.
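
A minimal sketch of the consumer-side join Samia describes, using pandas purely for illustration. The upstream product names, paths, columns, and access mechanism are assumptions, not her setup:

```python
import pandas as pd

# Assume the aggregate-spend data product has been granted read access to
# two upstream data products, each exposing a versioned, fit-for-use view.
customers = pd.read_parquet("upstream/customer_master/v2/customers.parquet")
spend = pd.read_parquet("upstream/regional_spend/v1/spend.parquet")

# The heavy, localized filtering already happened inside each upstream product;
# the downstream product only joins and aggregates what it was served.
agg_spend = (
    spend.merge(customers[["customer_id", "segment"]], on="customer_id", how="left")
         .groupby(["region", "segment"], as_index=False)["spend"]
         .sum()
)

agg_spend.to_parquet("outputs/aggregate_spend/v1/agg_spend.parquet")
```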

Scott Hirleman (33:22):I think where the question comes from is how complex of a query do you have to write, right? If you have to write a complex query to be able to do that, then we’re still at a place where we don’t have self-serve data. And so that’s the real question that drives that because it’s like, yes, we can do this, but if it’s overly complex, nobody can actually, very few people can leverage it. So Eric, you’ve had your hand up for a while. Yeah.

Eric Broda (33:49):My experience actually is very different, and maybe it’s a function of the fact that I do a lot of my work in financial services payments. But anything at scale, I’ve never seen anybody allow freeform federated queries in a database unless it’s been completely vetted, completely performant. What that really means is you don’t need SQL. What it really means is you just use APIs, okay? There are two distinct disadvantages of doing the cross-product query. If you have SQL, you’re kind of violating the whole notion of a boundary around your data product, because now you have to know the internals. Worse is you’ve now bound yourself to maybe three data products, and when one of them changes, all three of them fail with that joined query. So in my experience, we actually never, ever allow SQL-based joins across data products. Vetted queries hidden behind APIs is the way that we end up doing it. So, I mean, I know there’s a lot of folks that like Trino and all that stuff and the federated query, but nowhere have I ever seen that work successfully where you actually have it at scale and you actually want to apply real data mesh concepts around boundaries and loose coupling.
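
As a rough sketch of the "vetted query behind an API" pattern Eric describes (the endpoint, parameters, SQL, and storage engine are invented for illustration; this is not his implementation):

```python
from flask import Flask, jsonify, request
import sqlite3  # stand-in for whatever engine backs the data product

app = Flask(__name__)

# The only query consumers can trigger: reviewed, parameterized, and bounded.
VETTED_SALES_BY_REGION = """
    SELECT region, SUM(amount) AS total_sales
    FROM sales
    WHERE fiscal_year = ?
    GROUP BY region
"""

@app.get("/v1/sales-by-region")
def sales_by_region():
    year = int(request.args.get("year", 2023))
    with sqlite3.connect("sales.db") as conn:
        rows = conn.execute(VETTED_SALES_BY_REGION, (year,)).fetchall()
    # Consumers never see the table internals, only this stable, versioned shape.
    return jsonify([{"region": r, "total_sales": t} for r, t in rows])
```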

Scott Hirleman (35:15):For me, a lot of times it’s somebody that’s going around and trying to find some stuff.

Eric Broda (35:21):Sure. Again, I delineated very specifically: exploratory work, discovery work, is a very different thing than running analytics in production, where people really care about having the right data available all the time. I mean, there’s nothing worse than having an executive say, I haven’t got my sales report because somebody made a booboo with some SQL changes and what have you. But discovery work, that’s perfect for it, but not production analytics,

Scott Hirleman (35:50):But I think financial services, they know a little bit more of what they’re doing than a lot of the companies that are just trying to get there. So Paolo, you’ve had your hand up for a while. I don’t want to,

Paolo Platter (36:04):Yeah, I just wanted to ask Eric, what about self-service BI, self-service reporting needs? Because it’s a recurring topic when it comes up. Are you going to dismiss this dream, or

Eric Broda (36:25):No? So the vast majority of cases do not need this. There are some where you do need it; BI is a good example, if I want to do sales across all of my regions and roll it up. What I’m saying, though, is that the only way we allow that to occur is through vetted queries, no freeform. And as much as we try and hide it behind APIs, even when we use, say, a Tableau or something like that, the queries that execute are completely vetted. So there’s no freeform-type stuff that we ever allow. And like I said, the default answer is it goes behind an API unless you really need it. And there are a few use cases.

Paolo Platter (37:08):Cases. I am struggling a bit a P I. You mean

Eric Broda (37:14):Restful

Paolo Platter (37:16):Rest a P i going through H T T P and so on. What about when a report need to drill down, maybe a geographic map with million of points or something like that. Are you going to have good performances or then there is a secondary path for that?

Eric Broda (37:39):Well, for the most part, we haven’t found too many problems, to be honest with you, where you have specific tools, BI tools, that are engineered specifically for that. Again, the key here is vetted queries, no freeform stuff. And like I said, where possible we hide it behind APIs. It’s the general principle, I suppose; there are always going to be exceptions to the rule. The key, though, I think, is you don’t allow freeform out of the gate. As much as possible, you try not to make the innards of a data product visible by default, but there are exceptions, and the default provides a way to alert you to where you actually do need to have the exceptions, as opposed to having bad practices become the norm.

Scott Hirleman (38:24):And this is totally different, I think, financial services versus everything else, because what Eric’s saying is the thing that I constantly hear from financial services, and everyone else is trying to figure out how to get people to be able to use data. So they lower their restrictions on that, which is risky. But yeah,

Eric Broda (38:46):So yes and no. So here’s the situation you want to avoid: somebody has this wonderful query, and unfortunately it was a select all from a 10-million-row table, and they blew their cloud budget because Snowflake makes it really, really easy to do that. The real issue, though, is it’s not just PII and all the rest of the stuff, but if you allow unbounded queries to anybody and everybody, you’re going to have a cloud bill that’s astronomical. Because, I mean, we’ve all heard about the four-hour query that the person didn’t know was going on, and that’s not the surprise you want. So like I said, I think there’s room for discovery on a very careful basis, but federated or open-ended queries lead to great harm.

Scott Hirleman (39:38):You’ve had your hand up for quite a while.

Samia Rahman (39:39):Oh yeah, I completely agree with you, Eric. The whole point of data mesh and data products is to bound that complexity, and you are running production-grade analytics or providing production-grade, usable, reusable data. So you have to make sure that they’re vetted, have the QC checks, and it’s versioned and available for whoever wants to run it. The only thing, when you say API as a guiding principle, I don’t think it’s always necessary. You can have a dataset as a contract. So I can say this schema and this version of the data at this location, especially in a big warehouse for Power BI developers, et cetera, is sufficient. As long as it’s versioned and it’s modeled right for the fit-for-use, then we have the contract consistently.

Eric Broda (40:38):And I think you highlighted something that I forgot, but it’s so crucial: the whole versioning. Now, the reason a lot of folks use APIs, one of the reasons I use ’em anyway, is because there’s a relatively easy-to-understand versioning mechanism, and it actually is somewhat enforced if you use things like OpenAPI specifications and such. The problem you have with data, especially as you expose the tables, the table structure, and yes, you can hide ’em a little bit under views, but these things change and there’s no reasonable standard way of doing versioning when your schemas change, at least at the

Jean-Georges Perrin (41:17):Database level,

Samia Rahman (41:19):I think it’s been solved.

Scott Hirleman (41:37):Yeah, we’re getting almost nothing. You’ve got to yell into the mic.

Jean-Georges Perrin (41:40):Okay, I’ve got to yell. Okay, that’s all the wind. Okay. But the thing is, and it’s a good time because we’re about to wrap up here, first, the thing is, I completely disagree with Samia and Eric, and Eric, this is almost a deal breaker for our book. Okay, so be careful. See, we were supposed to publish our book together. Anyway, I think this is a topic we will have to cover sometime, about data consumption and about the versioning of data. But I’d like to wrap up about data sovereignty. I think data mesh, from what I heard, it’s not impossible to do, and that’s a good thing, okay? Right, we can do data sovereignty with data mesh. I think that what Paolo said was brilliant, about having the country-specific data products where the ownership is also linked to the data product, because you’ve got local ownership in the country who know the regulation, who know the data, et cetera, and with an aim of limiting data duplication. I don’t know if anyone understood anything I just said, but I’m thanking everybody for joining. Scott, any closing word?

Scott Hirleman (43:15):90% of this stuff is completely over my head. So if anybody else that’s listening is also a little bit lost and drowning in stuff as soon as we get into the overly technical, don’t feel bad about it. We’re all, this is an area where I am still so confused and so stupid on. So don’t feel bad about it, it’s totally fine.

Jean-Georges Perrin (43:41):Alright. And see you next week, and I will not be in Maine anymore. Okay. Bye-Bye guys. See you.

Paolo Platter (43:51):Bye. Bye-Bye.
