Rethinking MLOps Observability with Bernease Herman


MLOps Weekly Podcast

Rethinking MLOps Observability with Bernease Herman
Data Scientist, WhyLabs
Simba Khadder

This week, our host, Simba Khadder, sits down with Bernease Herman from WhyLabs to discuss the importance of observability in MLOps.


Simba: Hi, I'm Simba Khadder, and you're listening to the MLOps Weekly Podcast. Today, I'm chatting with Bernease Herman, a data scientist at WhyLabs who also holds a position at the University of Washington eScience Institute as a research scientist while completing her PhD. At WhyLabs, she uses aggregate statistics to collect data distributions for monitoring real-time machine learning systems. At UW, she does research on synthetic data and evaluation of machine learning models as well. Before joining WhyLabs, she worked as a software developer at Amazon on models related to inventory planning and control. Bernease, so excited to have you on the show today.

Bernease: Yes. Thank you for having me.

Simba: So, I like to start all these podcasts with the question, what does MLOps mean to you?

Bernease: Yeah, this was an interesting one. Simply put, I think MLOps is about how we think about the maintenance and the reliability of machine learning systems beyond their initial creation. In machine learning, especially as a researcher on the more academic side, academics certainly think about the creation of the model. You run your tests on it, you get some performance numbers, and then the model kind of ceases to exist. But when you move to production, you really need to think about how robust and reliable this thing is and how it lasts for a long time. So, I think of MLOps as about that space. Of course, there are big parallels to DevOps, which is a portmanteau of development and operations, and that's the same for the machine learning side of it.

Simba: Yeah. And actually, it's funny you bring up the DevOps thing. I'm curious to expand on that more. What do you see as the parallels and the differences between MLOps and DevOps?

Bernease: Yeah. So, DevOps has a number of things: continuous integration and deployment, things related to monitoring and signals of service uptime and availability, that sort of stuff. And I think DevOps really has two parts. One is these special tools that let you continually have observability and easy deployment of systems, but you also have the organizational aspects of DevOps, right? If you have an organization that does DevOps, you tend to think about the way software is developed very differently. I think machine learning ops has all of those things, but there are some additional requirements that you have to think about. One big one that we talk a lot about at WhyLabs is that when you do DevOps, you normally can get away with really simple metrics. So, let's say you have some service and it's running in production. You care about a number of metrics: the percentage of uptime of the service, the latency. You might even care about the volume of data that comes into the service, purely from a DevOps perspective.

And I think with MLOps, the thing that gets added on is that not only do you care about those metrics, but you also care about the content of the data that you see. That's normally a thing you don't really have to care about for DevOps. You may log it, you may have some sense of invalid inputs, but that's kind of it. For any machine learning data, or just data in general, not only do I care that the data got there, I care about what the distribution of that data is; is the average value different? So, there's all of this within-the-data stuff that you have to add on for MLOps.
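That within-the-data monitoring can be sketched in a few lines. This is only an illustrative check, not WhyLabs' actual method; the function name and the threshold are my own. It compares the production mean against a baseline (training-time) mean in standard-error units and alarms when the average value has drifted:

```python
import numpy as np

def mean_shift_alarm(baseline, production, z_threshold=4.0):
    """Alarm when the production mean sits more than z_threshold
    standard errors away from the baseline (training-time) mean."""
    baseline = np.asarray(baseline, dtype=float)
    production = np.asarray(production, dtype=float)
    std_err = baseline.std(ddof=1) / np.sqrt(len(production))
    z = abs(production.mean() - baseline.mean()) / std_err
    return bool(z > z_threshold)

rng = np.random.default_rng(42)
baseline = rng.normal(0.0, 1.0, 10_000)   # distribution seen at training time
steady = rng.normal(0.0, 1.0, 2_000)      # production data, same distribution
shifted = rng.normal(0.25, 1.0, 2_000)    # production data, average has moved

print(mean_shift_alarm(baseline, steady))
print(mean_shift_alarm(baseline, shifted))
```

In practice you would track far more than the mean, but the shape of the check, a stored baseline profile compared against a live profile, is the same.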

Simba: So, expanding on that a bit. There are a lot of categories in MLOps, there are a lot of vendors in MLOps, there are a lot of random example solutions out there. I personally feel like there isn't yet an example of a canonical, perfect MLOps workflow. I can't personally think of one company that's just like, this is perfect, this is exactly how everyone should do it. And I don't even know if there really exists such a thing as a perfect MLOps workflow, but if it did, or I guess as close as you can get to it, what does it look like when MLOps is done right?

Bernease: Yeah. That's a great question. I will definitely take the same side that you're on; I think it would be a hard thing to take the other side and defend the perfect workflow. I definitely don't think that's the case. I do think that when people get toward MLOps, they have a specific vision of the system in their head. Maybe it's the specific problem that they're working on, which makes total sense, or just the things that you hear over and over again. So, for example, if you're talking about feature stores, you often get into the space where you have customer data, and lots of different parts of the organization or website give you different aspects of the customer. You'll want to do some aggregation, have some sense of what the current features for this customer are, and serve that. But I think that's one aspect of it. And that makes total sense in some of these cases; there are other aspects.

So, when I worked at Amazon, I was on the supply chain optimization team. And our job basically was to decide, for every ASIN that wasn't third-party vendored, and there are millions of ASINs, an ASIN being just a product, or a SKU really, I think is the best equivalent, how much should we have in stock at any time? So, the problem is very different, because unlike customers or things like that, it was a lot more ephemeral. If I made a decision today about how much we should have in stock, I might buy some more stock or decide not to buy some, but tomorrow I'm going to remake that decision, and I actually don't care about what I said yesterday. I just care about how much I have in stock. So, I think if you have a very ephemeral system, the world looks a little different: you don't need to store as much information, you need to be able to recompute it. If you have a system with a fixed cohort of people, like customers or things you have to predict for, then you're going to want to store more information. So, it really depends.

Simba: Yeah. It sounds like it's really dependent on the use case. I like to use the idea of self-driving cars versus fraud detection, where they're both ML, they both need MLOps, but the problem statement for a self-driving car model is just so different. It sounds like a lot of what you're saying is also: understand what your problem is, understand which parts are of the highest value, and just focus on those first and go from there. Is that fair?

Bernease: Yeah, I think that's fair. And I think that also feeds into the different components and tools that make up your system.

Simba: So, maybe a different tangent. I'm curious about your background. You came from more of a statistics and math background, you're doing your PhD, you come from an academic background as well. It's interesting with MLOps because it's funny, you get people from both sides of it. Like, I was an engineer before; I came from the software engineering side, got into product ML, building recommender systems, and then went into MLOps. So, it's kind of the other side. And it's always interesting for me to try to understand the difference in perspective and the different strengths that come from each side. I'm curious, from your perspective, how does coming from a statistics and math background compare to coming from a CS and software engineering background for MLOps?

Bernease: Yeah. So, I'll give a little background on where I came from. As a kid, I loved statistics and business. I played a lot of sports, but outside of that, I spent a lot of time on business ideas. So, when I was picking my college and deciding where my career would go, the only thing I knew, based off of my family, was being an actuary or an epidemiologist, someone who really does statistics. For those who haven't heard of an actuary, they create statistical models to measure uncertainty, often for insurance. But when I started down that career, I realized that it's a very structured career. And so, I moved to financial engineering, and I worked at Morgan Stanley, a big investment bank, on the trading floor and doing research. And both of those just felt very structured as a career and not very entrepreneurial. You know, you work for a large company, you kind of get golden handcuffs in a way, especially in finance. You make a lot of money, but your skills get more and more focused, so you can't really apply them outside.

So, I applied as a senior in college to software jobs, and that was so hard. As a person with a stats and math background, I had taken a number of programming classes, at least at the time; you kind of have to, even as a stats major. But the way of thinking about software engineering goes well beyond being able to write some code. I remember being on the job at Amazon for the first time and there being a race condition; I had no idea what a race condition was. I was used to writing some code, and then it runs, and that's fine. I never thought about threading, I never thought about any of this stuff. And so, I think that part is certainly hard: just getting used to the way that you think about software when you don't have a formal background in it.

I actually teach a class at UW called Software Design for Data Scientists, because I really believe that you have to understand some software patterns, how code gets structured together, to be a good data scientist, because you're going to be embedded on one of these teams. When someone asks you to write code, they're going to want it to still work in a couple of months. That means you have to actually think about how to structure it for the long term. It's not necessarily a notebook that you can just rewrite when you need to change it. So that's a big thing: just understanding the software engineering concepts is difficult.

I think another thing that I really struggled with at Amazon, and why I initially left Amazon to go to UW, is that as a statistician, I had a really strong sense of methodology. What is the right method for this use case? What assumptions are we making? And then that tells me what the valid things I can do are. It's very different from how machine learning works, very different from even normal software development, where it's much more about what is possible and what is reasonable, and let's build up, maybe toward the right thing to do, based on testing and empirically deciding that.

So, I think that mindset is certainly a shift, especially coming into MLOps, because even in machine learning, you can be very theoretical and then find some performance metrics and be done. But the moment a model is going to have a life and be in production for some time, all of this computational complexity, the time requirements of getting to the perfect solution, those things will weigh on you, and you need to really learn that other perspective and just deal with it. Like, I will always feel that things as a software developer are a little less rigorous than my statistician brain wants them to be.

Simba: It's funny. For me, coming from distributed systems to machine learning, I was kind of seeking the opposite; I didn't like it when there were clear answers. Even when I did machine learning, I preferred recommender systems, because there's no such thing as a perfect recommendation. Like, serendipity, what does that even mean? You can't really measure it the same way you can measure everything else. So anyway, I very much empathize with wanting that complexity in the problem, the gray space in the problem, making things less clear-cut. It's kind of the same reason why people go into startups too: to have a little bit more excitement and just push yourself more.

Bernease: Yeah, totally. I totally get it. I think I have a bit of both sides, but yeah, there's definitely a stats side that's like, okay, I'm okay with uncertainty, but it's quantifiable in certain [crosstalk 12:09]

Simba: P-values where… so I guess the startup you're working on is related to AI observability. Your research is more on evaluation and data, but still, it's very much tied together. Why did you pick this problem to work on in particular?

Bernease: Yeah. So, I think you brought up a good point. Evaluation is very much tied to all of what I think about for MLOps, right? We certainly have a way that we do evaluation when we're training a model. We train a model, we take some holdout set, and then we measure this evaluation metric on it once. Maybe, if we're really fancy, we might do some statistical tests to compare two models to see which one we want to use and whether or not it is indeed better, but then that's it. And so even as a researcher, I thought, well, evaluation means a lot more than this, right? We can't always have these ground truth labels, and we really shouldn't always trust the ground truth labels. They're not always perfectly true.

And so maybe there are different ways to think about evaluation, you know, under drift, under other situations where maybe we have weak labels, and how do we use those together? I think that's a natural fit with the problems you think about for machine learning in production, or really just in industry in general. It's not necessarily enough in the academic sense, where if you can get those labels, you kind of want to prove it with the same tools that the people before you used. But I think industry really motivates this idea of, well, how do I get it working with these new tools and these new mindsets of evaluating?

So, yeah, that's a large part. I think another reason that I really like working on AI observability and MLOps and just kind of in industry in general is that I think you can have a lot of impact. So, I think a lot about kind of fairness and bias in machine learning. I think we all think about this a little more in the last years as kind of it's become more popular and talked about.

And I think one day I sat back right before I joined WhyLabs. And I kind of thought, well, if I'm writing these academic papers, even in machine learning there might…I might get lucky and write a paper that lots of people see and impacts the industry, but the companies that are doing these things that are producing new products, they have to have those best practices and tools in order to make the world a better place, kind of more ubiquitously. And so, my thoughts were really, if I changed the way that ML is practiced in industry and the way that these systems are maintained, I can have much more impact as well as doing the stuff that I like to do more.

Simba: Yeah. That makes so much sense. You're very much living what you preach, in that your current work ties together making an impact on the research side, but also, more specifically, working on the startup and actually bringing this stuff to market, actually making an impact in the real world. And so, you joined WhyLabs in particular, and I know we talked a little bit about this before, but you have kind of an interesting story of how you joined. I know WhyLabs has a unique perspective on observability, which I really want to dig into, but maybe we could start with the story. How did you join WhyLabs? How did that all play out?

Bernease: Yes. So, the CEO of WhyLabs, Alessya, and I worked very near each other when we were both at Amazon. My team, as I mentioned, was the team that took all of the SKUs that we sold at Amazon and tried to decide how much we should hold in inventory at any given time. It takes a number of inputs; in fact, it's a large optimization process, but there's some statistical modeling that you have to do before that optimization. And one of the large inputs is the demand forecast. So, there was a different team that did demand forecasting at Amazon. And that kind of spun out of, I think, the ML core concept that was going around companies, and I think still exists in companies, where there's one team that does the more pure machine learning stuff and helps to add that in places.

So, the demand forecasting team certainly needed that. And Alessya was a huge part of that effort, as was another person who came to join us at WhyLabs. They were the primary input to my team that does this optimization. And because, as I mentioned, it's an ephemeral task, we do a ton of calculation, and at the time we were like, we have no hope of storing this, so we didn't store it at all. We tried to make a system that was stateless and recalculated when we needed to. The demand forecasting team posed a challenge to that, because even though we wanted to be stateless, we wanted our inputs to store their values when they changed. If they send us a new value and don't let us know, that means I can't go back and debug the decision the system made yesterday.

So, I say all of this to say that Alessya was the head of a team that was my adversary at the time, because I needed them to deal with persistence in a different way than the way we ended up dealing with it. I didn't know that, but eventually I left Amazon and worked at the University of Washington at the eScience Institute, and I continued to work on these things related to evaluation and bias. Alessya was at UW at the same time, and she was also very interested in these sorts of topics, and in how you build a company around this, when we both agreed that the way to make an impact is to build best practices and share tools embodying those best practices with other companies. So that is how we met. We met at the nicer time; I can imagine we would've liked each other a little less had we met back when her team actually had more impact on mine.


Simba: That's so funny. It's funny how you can never predict how things play out. I have a similar story: my manager from Google, when I was an intern there on a very random project that had nothing to do with MLOps, now also leads a venture-backed MLOps company. If you told me back then that was going to happen, I would have looked at you very, very confused. It's just funny how these things play out. And I guess it shows, you know, just be nice to people. You never know when you're going to run into them again. It's a good life philosophy.

Bernease: Totally. And I think if the way that you see the world or the industry is similar, then you'll end up finding yourselves in different spaces. Even if they're different from each other, depending on the environment, you'll kind of shift toward the same sorts of solutions.

Simba: Yeah. And we see this very dramatically in MLOps right now. You know, it feels big, like it's a thing that's happening, but even doing this podcast, being in the community for so long, you start to realize that it's actually quite small. There are a lot of people doing MLOps, but in terms of who's driving the category, driving the direction, everyone sort of knows each other or has heard of each other. And it's kind of a crazy thing sometimes. I've had it happen where someone will start talking about a project and I'll name the person, like, oh, you're talking about such and such. And they're like, how do you know that? It's like, well, there are only so many of us. It feels like there's a lot going on, but when you're in it long enough, you start to parse it all, and you have a good idea of who's doing what, who's thinking what.

Bernease: And you know how different people think. I think sometimes you can read a blog post and you're like, oh, I know who wrote that, because that's the way they think about this problem. So, another thing that drew me to WhyLabs and to this particular solution is that I really like technical complexity. One thing I worried about going into tech in general, coming out of college, is that I do really like stats, and I wasn't a great entrepreneur at the time, I think maybe because I liked the complicated technical solution more than the simple solution, even if the simple one would solve the problem. So, that's of course not great. But when you can find an example where the two converge, that's amazing for me. One thing that we really ran into at the beginning of WhyLabs is, well, we have two options; maybe multiple options, but two of them were in our minds at the time. One option would be, like a number of other solutions, to have some stream of the data, or have the data uploaded to us by API, and then we can run all of these calculations on it and do monitoring, and maybe lots of other stuff in the future.

The other option that we were considering was driven by scale. Alessya and I both had that experience with an ephemeral data set, and a lot of the founders and early employees of WhyLabs have worked at Amazon, so we've been used to massive scale for a long time. One thing we realized is that there are some situations where you're never going to be able to upload all of the data that you process, you know, all of the queries, even for your prediction models, in full fidelity.

And so maybe, can we do that processing on the client's machine, in the actual place the data already lives, and then take the results of that and pass them up to us? We're really happy that we went with that approach. I think it's a very privacy-first way of doing things: we never see the raw data of a customer, and that's super helpful for those conversations with customers as they go through their security review, right? Like, we never saw your actual data; we just see aggregate statistics. But it also is really challenging. If we had the raw data, we could do correlations way more easily, but since we don't, we have to come up with ways to approximate these sorts of things quickly and in a small amount of space, as well as make decisions beforehand about what we need to collect. It's a little harder to go back and make those collections later.

So now we have a mix. We never upload raw data, but in addition to collecting metrics, you have to collect things like the distribution of the data. So, if I have a statistical distribution, now I can recover the histogram however I want to chop it later, but it's still approximate; all of this is approximate. And so that is the tradeoff that you make. But what that means for me, as a person who likes complicated statistics stuff, is that it's just way more fun to be a statistician at a place that has all of these complex data sketches and statistical innovations that we have to do. I think anyone who loves statistics is like a pig in mud here, I guess.
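A toy version of that approach, profiling on the client and shipping only aggregates, might look like the sketch below. It is purely illustrative (the class and field names are my own, not WhyLabs'); production systems use adaptive, mergeable quantile sketches such as KLL rather than a fixed-range histogram, but the privacy property is the same: only summaries ever leave the machine.

```python
import json
import math

class ColumnProfile:
    """Toy client-side profile: track only aggregate statistics
    (count, missing, mean, min/max, fixed-bin histogram) so the
    raw values never leave the machine they were produced on."""

    def __init__(self, lo, hi, n_bins=10):
        self.lo, self.hi, self.n_bins = lo, hi, n_bins
        self.count = 0
        self.missing = 0
        self.total = 0.0
        self.min = math.inf
        self.max = -math.inf
        self.bins = [0] * n_bins

    def track(self, value):
        if value is None:
            self.missing += 1
            return
        self.count += 1
        self.total += value
        self.min = min(self.min, value)
        self.max = max(self.max, value)
        # Clamp into a fixed-range histogram; real systems use
        # adaptive, mergeable sketches (e.g. KLL) instead.
        frac = (value - self.lo) / (self.hi - self.lo)
        self.bins[min(self.n_bins - 1, max(0, int(frac * self.n_bins)))] += 1

    def merge(self, other):
        # Profiles over the same bin layout combine cheaply, which is
        # what lets summaries roll up across machines or time windows.
        assert (self.lo, self.hi, self.n_bins) == (other.lo, other.hi, other.n_bins)
        self.count += other.count
        self.missing += other.missing
        self.total += other.total
        self.min = min(self.min, other.min)
        self.max = max(self.max, other.max)
        self.bins = [a + b for a, b in zip(self.bins, other.bins)]

    def summary(self):
        return {
            "count": self.count,
            "missing": self.missing,
            "mean": self.total / self.count if self.count else None,
            "min": self.min,
            "max": self.max,
            "histogram": self.bins,
        }

profile = ColumnProfile(lo=0.0, hi=100.0)
for value in [12.5, 47.0, None, 88.1, 47.3]:
    profile.track(value)
print(json.dumps(profile.summary()))  # only aggregates, never raw rows
```

The merge step is what makes this shape scale: each summary is tiny and combinable, while the raw rows stay wherever they were produced.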

Simba: I was looking at the product, and we talked a little bit before this. It's so good in this category to have all these different takes. It's really interesting. Like you said, the WhyLabs philosophy comes from what you saw at Amazon. There are a lot of parallels in different parts of the stack. In our case, one thing we thought was funny, or interesting, was that a lot of large companies, when they train their models, actually don't train on all their data. There's just too much data; they always sample. But a lot of people had never thought about that, because they haven't worked at that size of data set. So they hadn't realized that the problem isn't actually that we don't have enough data. It's more about the quality of things. It's not so much the quantity of data; we have more than enough.

And so, it sounds like a very similar problem, where it's like, sure, if we had all the data, we could do more, but we can get 90% of the way there at 10% of the operational cost, as opposed to having to literally push every single thing up. And maybe more broadly, I'm curious: what do you think most people get wrong when thinking about MLOps broadly, or maybe even just observability in particular?

Bernease: I mean, I think it fits into exactly what you just said: we each have a specific approach to how we learned these things. There's a general approach that you learn about machine learning that involves a small data set that fits on your machine, that you process maybe even over and over again. And so, I think the thing people get wrong is the assumption that you're going to have a particular setup or particular priorities. One big example that we've run into a bit, and I imagine lots of other observability tools that do monitoring will run into, is that you have some customers who have a way to get ground truth even for their production system.

So, for example, if you have a model that does forecasting or prediction over time, the cool thing about that is you get your ground truth, the actual answer, once that time has passed. At Amazon, if we predict how many customers are going to buy this thing on Tuesday, maybe on Monday we don't know the answer, but by Wednesday we can actually evaluate our model in near real time. I think that's a setting we don't talk about a lot. And then I think we also don't think about the opposite. There are lots of other cases where getting ground truth means sending the data off to a company that does labeling, or Mechanical Turk, or one of those places, and there's no way you're going to do that every day, or even every month. So, there are lots of little embedded factors in MLOps, and I don't think anyone's really enumerated all of them. And I don't think any one person thinks about all of those things when they're making a tool, or just making decisions for their own company.
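That delayed-ground-truth loop, log predictions now and score them once the actuals arrive, reduces to a keyed join. A minimal sketch, with SKU names, dates, and numbers invented purely for illustration:

```python
from datetime import date

# Log predictions keyed by (item, target_date); score them later once
# actuals arrive. All names and numbers here are made up.
predictions = {
    ("sku-1", date(2023, 5, 1)): 120,
    ("sku-2", date(2023, 5, 1)): 40,
    ("sku-3", date(2023, 5, 1)): 75,
}
actuals = {
    ("sku-1", date(2023, 5, 1)): 110,
    ("sku-2", date(2023, 5, 1)): 70,
    # sku-3's ground truth hasn't arrived yet
}

def delayed_mae(predictions, actuals):
    """Mean absolute error over only the keys whose ground truth exists."""
    errors = [abs(pred - actuals[key])
              for key, pred in predictions.items() if key in actuals]
    return sum(errors) / len(errors) if errors else None

print(delayed_mae(predictions, actuals))  # (|120-110| + |40-70|) / 2 = 20.0
```

Note that sku-3 simply drops out of the metric until its actual shows up, which is exactly the partial-feedback situation described above.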

Simba: Yeah. I really want to expand on that, because it comes up a lot, and it almost circles all the way back to the DevOps-versus-MLOps question, where a lot of the DevOps stuff does a subset, or maybe even a superset, but a generic superset, of what some of the MLOps tools do. And observability, I think, falls into this, because in dev tools we have [inaudible 26:25], we have [inaudible 26:27], we have generic observability tools. We've had them for a while; we know they work. But MLOps has its own set of observability tools, WhyLabs obviously being one of them. I guess, why does this category need to exist separately from the DevOps observability category?

Bernease: Yeah, I mean, for me it goes back to this idea that metrics aren't necessarily enough. Well, I can't say there aren't rare situations, but the average DevOps tool can really live in a world where you have some metric and you monitor that over time. In the world of machine learning, you not only care about the availability of the data, but you care about the actual complexity of the data. And handling that means we now need to store the distribution of the data, and we need to understand the correlation of one type of data with another. The tools just aren't set up for such a complex input to track over time. And then also, when you want to go explore that, you need tools to dig into that complexity, the correlations and the distributions and that sort of stuff. Not to say that there isn't a world where DevOps tools also expand to MLOps, but there's just some complexity there that's so specific to data and models.

Simba: Yeah. That makes sense. This idea of like the data, it's a different beast in ML than it is anywhere else.

Bernease: Yeah. I imagine you have the same thing at Featureform, right? I think people don't really understand why it's important to have a vector database, because if you don't think about the curse of dimensionality and all of these things that come in that are so specific to not just data, but the way machine learning works and how high-dimensional that data is… it's when you understand those things that you can start to see the value of purpose-built ML solutions.

Simba: Yeah, we have Embeddinghub, and we definitely see that: you face this problem and then it just clicks, like, duh, I need this. It's almost bringing together a lot of different components, which I think a lot of MLOps is. These things exist, but they exist separately. This is how Featureform's actual feature store works: we have this virtual feature store approach, which we won't go too deep into on this podcast. But the thing that's interesting, that we find time and time again, is that a lot of the problem set is more of an application problem. The infrastructure exists, it's existed for a while, but how do we turn this into an application layer for a data scientist? Observability fits into this, where, sure, you have all the different pieces. You can build some basic way to get distributions, you can do metrics monitoring, you can get latency in different places, you can do all this other stuff. But the problem is that you have to tie it all together, you have to make it useful for a data scientist, you have to fit it into a workflow. And there are also the organizational components: if you have large teams, or multiple teams, or multiple models, or models feeding into models, there are all these cases where you just end up band-aiding one feature on top of another on top of another. Whereas if you start with the right core, you can still use all these tools that already exist; in our case, we don't build our own storage engine, we use Redis and Cassandra and other tools, and we don't build our own transformation engine, we use whatever our clients are using.

So anyway, a long way to say that there's the application layer and the usability layer and the organizational layer. And I've found time and time again that the organizational problems of ML are almost more painful and harder to solve, and honestly, less fun to solve. You know, IAM roles are not nearly as fun as trying to build a better database, but from an organizational perspective, the IAM roles are probably going to give you more value than a slightly faster database.

Bernease: Yeah, absolutely. And I think machine learning in production is just so different from the tools that have already been developed in the machine learning space more generally, right? I can't go a whole week without someone talking about Jupyter notebooks. Right. For good reason, because Jupyter notebooks are really easy and convenient for folks. I definitely have the opinion that you run into a lot of things when you try to take the notebook itself and put it into production. But I understand why people want to do that. It's now embedded into the workflow of every data scientist, and we have to think about that. That part is important. It's not just the underlying tech; that application layer, as you said, is a huge part of this, and that is related to the culture and organizational setups of the companies.

Simba: Yeah. We actually had a talk about this at one of our recent round tables. I think it was the one after the one we met at. We talked about Jupyter notebooks in particular, and I'm curious to hear more of your thoughts on it. There were two interesting arguments. One is that Jupyter notebooks are entirely for exploration and explanation, and that's just how it should be done. People from the data science perspective took the opposite view. They were just like, that sucks. Why do we have to do things twice? You know? They were hoping, hey, I just want to work in one way and have it magically work in production. And I'm curious, maybe this is almost like the perfect MLOps workflow question, but where do Jupyter notebooks fit? Should they kind of go away? Should they be extended into something else? Are they just an explanatory thing? How do you think about Jupyter notebooks in the process?

Bernease: Yeah. So, this is a great question. I think it pulls out a couple of things. So, I love Jupyter notebooks. I'll say, even in my academic role, I actually worked a lot with a number of data science institutes, one of those being the Berkeley Institute for Data Science, where Fernando Pérez, who created IPython and Jupyter notebooks, started. And that's where the Jupyter project sat, at least at some point; I don't know where it's sitting today. So, I will say I love Jupyter notebooks. I don't know exactly where they fit into this process, but I do believe that data scientists have to be at multiple parts of this pipeline. So, it's clear that data scientists or machine learning engineers, well, maybe we'll say data scientists, are the ones to create the model initially, do the analysis that starts the creation of the model, and then it gets deployed in production. I'm probably not a fan of using that notebook to do so. You know, notebooks have lots of problems, in that you're not forced to run them in a particular order. There are lots of things that don't make them feel like the safest thing to run in production, although people have come up with solutions to fix that.

The other thing that we don't talk about enough is that a data scientist is also really important in root-causing issues with the model. So, there are some issues that are about availability. If you get no data, or you get the same data, the same value for every query, then maybe that's a problem with the system somewhere, or the database, or something like that. But there are some problems, many problems, where you're getting a prediction, but the prediction is wrong. This is largely going to remain the data scientist's problem. And so that means they have to be on both ends. My thought is that having some way to pull out some data and get it back to the data scientist's environment to do that analysis is a great way to go. Maybe the answer isn't doing that at full scale; maybe you can take some aggregate statistics or something like that, or connect a notebook to the larger database in some way. But either way, a data scientist is going to have to come back to do the root-causing of problems during production. And so, for that reason, if they're comfortable in notebooks and that's the right place to do this kind of statistical deep diving, then we need to find a way to integrate notebooks further down the pipeline.

Simba: Yeah, it makes a lot of sense. And I think it's funny how sometimes MLOps and actual data science workflows are so disjointed. It's almost like people are building MLOps without taking the perspective of what people actually do today. And it's something that should be talked about more, kind of creating that connection. And you know, you came from statistics and continued to work in that realm, so you have a very interesting perspective on it, which I'm sure really helps with the solutions you're building for observability. There's a million questions I could ask, but I want to make sure our listeners have resources to continue learning. There's so much information out there. Some of it's contradictory, some of it makes no sense, some of it's already outdated because it's been four months, and that's how long things can last sometimes in ops. What are your favorite resources, or just a resource, to learn about MLOps?

Bernease: Yeah. So, there's a couple. There's the MLOps Community; they have a Slack channel that is very active. I can't admit to being able to follow it very well because of time, but it's really great. Everyone seems super nice and you can reach out there. There's a number of great websites. There's Made With ML, which is great training that lots of people talk positively about. Another thing that has really helped me a lot is the tech blogs of a number of companies, like Stitch Fix and a lot of large companies that have these tech blogs that talk about their solutions. I find those to be super helpful because they tend to be in a specific context. Especially if it's a large company, you kind of understand how Netflix works, so maybe you can understand what the Netflix problems would be. And so, it's really helpful to me to read those technical blogs.

Simba: Those are amazing resources. I personally read and follow a lot of those, and it goes back to MLOps being a small world. I smiled a little bit when you named all those things, because I can think of the individual people who help write them or help run those communities. So, it's awesome. And I think it just boils down to this: there are a lot of ways to join the community. It is still a small community, and I think just joining and listening, there's so much osmosis and ideation that comes from all those resources. Bernease, this has been such an amazing chat. It's always great to chat MLOps with you. Thank you so much for taking the time and sharing your insight with our listeners.

Bernease: Yeah, absolutely. And I'm bummed I missed the most recent round table. I missed my flight that day, so I was dealing with that, but I'm excited to join you there too.

Simba: Yeah, that's awesome. We'll include links to our round table, and links to places to keep up with Bernease and any content she puts out, and that WhyLabs puts out, in the description of this podcast. Thank you all for listening, and I hope to chat with you all next week.
