Simba: Hi, I'm Simba Khadder, and you're listening to the MLOps Weekly Podcast. Today, I'm chatting with Chaoyu Yang, the co-founder and CEO of BentoML. Before starting BentoML, he was an ML engineer at Databricks, where he led the design and development of multiple core products in Databricks' unified analytics platform. Chaoyu, it's really great to have you on the show today.
Chaoyu: Thank you for having me, Simba.
Simba: I like to open up and start with just asking about your journey into MLOps. How did you get into MLOps?
Chaoyu: Yeah, I actually spent most of my career building development tools for data scientists. My first job was as an early engineer at Databricks. At the time, the team was building a unified analytics platform on top of Apache Spark, with some pretty cool projects around managing Spark clusters on AWS, as well as providing the workspace tool and orchestration tool for whole end-to-end data science projects. Spark was already very popular and widely used in the industry, which gave me a lot of exposure to how people are working, building data products, and using machine learning to solve some very interesting problems. I definitely took a lot of inspiration from that experience, which led me to start BentoML, an open source project for model serving and deployment.
Simba: So you were early at Databricks, and you mentioned you were building unified analytics on top of Spark. I've seen MLOps, even just the term, proliferate over the years, and the way people think about it has changed too. Nowadays it seems to be a focus for a lot of machine learning teams and data science teams. What do you think has changed over the last few years?
Chaoyu: I think the biggest change is probably that some of the standards, typical architectures, and workflows have become more well known across the industry. Many teams are more aware of the term MLOps and are thinking about how they empower their data scientists to ship better models faster, and how they ensure a reliable, high-performing system so that their ML models are actually delivering business value. So I think the biggest change is that more and more open source projects have become mature, and some of them have become standard in the industry.
Simba: Why do you think the open source projects started to come about in the last three years or so? Was it just awareness? Was it that people didn't realize everyone was doing all these things? What do you think changed to cause this explosion in open source tooling?
Chaoyu: I think that definitely has to do with the entire data ecosystem becoming more mature. If you look back a couple of years, people were still struggling with ETL and data pipelines. There was no good solution for managing a whole data warehouse product and running analytics on top. As those products become more mature, companies like Databricks and Snowflake are providing a lot of powerful tools for people to manage their data. More data science teams are given the tools to do predictive analytics, and now it's just the right time for them to move forward and get their machine learning workloads into production.
Simba: That makes a lot of sense. It's almost like this is the next wave of making data useful beyond just basic analytics. It's actually applied data, almost. You mentioned workflows, and you mentioned that there's a proliferation of tools. I have a ton of questions for you about this that I'm really curious to get your perspective on. But maybe to start: I know there's no such thing as a perfect MLOps workflow, but if you had to define what good MLOps looks like when implemented well, what does that look like?
Chaoyu: Yeah, I think an ideal MLOps solution is definitely highly focused on data science agility, because after all, you're trying to bring the power of data and machine learning to certain business problems and use cases. In a data scientist's mind, the goal is to produce models that are accurate and have the biggest business impact. Enabling them to quickly get their model deployed, understand how it is performing, and iterate through that process to improve their models is the key. So in my opinion, MLOps should be focused on empowering data scientists to be successful. Of course, in that process, since you're touching production workloads, you want to make sure your services are reliable. There is a bunch of collaboration with the engineering team and the DevOps team, but from our view, those are really supporting roles in this MLOps ecosystem. The primary focus should be empowering data scientists. That's what ideal MLOps would look like.
Simba: I love that. Yeah, I think sometimes we almost throw the baby out with the bathwater. We get really caught up in doing MLOps for MLOps' sake and forget the goal, which is to make your machine learning team more productive and reliable, and everything that comes with that. We talked about the MLOps ecosystem, and we talked about the explosion of open source projects. One thing that I get asked a lot, and think about a bit, is platforms versus best-in-class vendors. There are some MLOps teams, either internal teams or vendors, who try to build MLOps end to end, like a full platform that does everything. On the other hand, there are a lot of vendors who are more focused and try to provide best-in-class tools that the end user will have to stitch together into a platform. Does one make sense in some cases? How do you break those up? If I'm listening to this and deciding whether I should use a full platform or stitch together best-in-class tools, what should I be thinking about?
Chaoyu: I think from a lot of buyers' perspective, an end-to-end product probably makes a lot more sense, because you have a lot less operational overhead. You get one solution that just works end to end. But in practice, especially when it comes to MLOps, there are so many different components, and teams have very different needs at each part of their machine learning pipeline and workflows. I definitely think there are a few big components that can each stand alone as a platform. For example, there is definitely a place for a platform that focuses on just model development and experimentation: a playground for data scientists to experiment with different types of models, try out different features, look at all the experiment results, and select the right model. But when it comes to the production side, there is a very different set of needs. That's why we are focusing on the serving and deployment part. I believe that's going to be a platform…That's a whole different set of problems, related to models in production and continuing to experiment and iterate on models in production. I believe there are a couple of other examples. Feature stores and monitoring are already great examples of where a single point solution that integrates really well with the rest of the ecosystem makes more sense.
Simba: It's interesting to watch it play out. I feel like it used to be very platform-centric, and now even the platforms are almost unbundling themselves. A lot of the original platforms now offer point solutions, feature stores, and everything else that you can mix and match. So let's talk about Bento. What is Bento, for people listening who don't know?
Chaoyu: BentoML is an open source project that focuses on machine learning model serving and deployment. It provides a really easy-to-use API for data scientists to package their models and build machine learning services around them, along with a set of tools that makes it super easy to get models deployed into production. The goal for Bento is really to provide a standard and a shared language between the data science team and the engineers who are running machine learning workloads in production. A Bento is actually a concept in BentoML that represents a unit of deployment: it packages the machine learning model, code, and dependencies into one standardized format. Our users can build a Bento into a Docker image for deployment, with production DevOps best practices built in. We also build tools on top of Bento, for example to deploy a Bento directly to a Kubernetes cluster and run it at scale, or to deploy a Bento directly to cloud services such as AWS Lambda or SageMaker.
Simba: We talked about monitoring, we talked about feature stores, we talked about platforms. There's a ton of, I guess, surface area in MLOps. What made you decide to work on serving in particular?
Chaoyu: I think serving is one of the problems that data scientists frequently run into once they get the model trained locally and have some basic evaluation results. Getting the model to production helps them actually validate many of their ideas and get a good sense of how well the model is performing for their particular use case. Since that's one of the first problems people run into once they are thinking about getting a model to production, we just felt that's an important problem to solve. And three years ago, when my co-founder Bo and I started looking into the MLOps space, there were really no good tools for people to build the right model serving system. Bigger companies that can hire a really big group of machine learning engineers usually build in-house solutions designed specifically for one of their use cases. But for the rest of the world, data scientists are struggling to put their models into production. We've seen many teams trying to use web technologies like Flask or FastAPI to get their model deployed, but they then run into all kinds of operational challenges. Those tools are also not ideal in terms of performance and reliability. We see those as problems that desperately need to be solved for machine learning teams, so we chose to work on the serving and deployment problem.
Simba: That's awesome. That makes a lot of sense. You mentioned that people have in-house platforms, and nowadays there are many more serving tools. Some are proprietary, some are open source. What makes Bento different? Why do people choose Bento over other tools? When does Bento make sense, and when does it not?
Chaoyu: I think in the market today, there are a number of tools that help people build a serving solution, but none of them are really built for data scientists. One example is model server products such as TensorFlow Serving, which is really just a runtime around a TensorFlow model that provides a tensor-in, tensor-out interface. In order to build a serving solution that actually makes sense for a data scientist's workflow, you typically need to build a number of additional components around TensorFlow Serving. BentoML, on the other hand, focuses heavily on enabling data scientists to quickly get their model deployed. One example in comparison is that we provide a Python interface for data scientists to define their serving logic. That includes the pre-processing and post-processing logic, business logic, and batching features, as well as a high-performance model runtime to run that serving logic.
Simba: You talked about in-house versus open source, and then vendors, I guess. When does it make sense to build in-house, and when does it make sense to go with a vendor? How do you think about that?
Chaoyu: I believe that in the market today, as more and more open source machine learning tools become mature, there are fewer and fewer reasons for teams to build their own in-house solution. Certainly there are some very special use cases that come with really strict requirements on certain aspects of the serving system. But teams are aware that it's very expensive for them to rebuild the entire serving system. Most of the time, they are probably still going to use some sort of open source components in that process.
Simba: Some vendors decide to stay proprietary, and some vendors are open source. You all decided to open source, obviously. Why did you open source, and how do you think about open source as part of your strategy at Bento?
Chaoyu: At BentoML, we are a startup company, and we build the open source project BentoML, along with Yatai and bentoctl, a number of open source projects that are free for our community. I think quite a lot of startups these days, especially the ones focused on infrastructure and developer tools for enterprises, are going with this open source strategy. We first offer an open source product that really provides value for developers, and we get a community that helps us improve and contribute to the project. On the other hand, there are always going to be teams that struggle with operating these open source products in production. That's where we get the opportunity to provide a managed service for our open source products and offer additional enterprise or security features around them.
Simba: Can you share an interesting story? It could be about a user of Bento or just MLOps in general, or about someone's serving setup before they were using Bento. I think it's always interesting to bring in some real-world stories about MLOps. Lots of times when we talk about it, we talk about it in a bubble: if you were to do everything perfectly, here's how it would look. In reality, it's a lot more messy. What are some interesting stories you can share?
Chaoyu: One of our customers was previously building an in-house solution. They were using TensorFlow Serving and had another pipeline that did pre-processing and post-processing around the model server itself. One of the problems they ran into was that as data scientists kept iterating on their model, they kept changing some of the pre-processing logic and feature transformation code. But on the engineering side, they only see the Jupyter notebook and the saved model files. Once they had built that production serving system and got a new model from the data scientists, their first reaction was to just upload the new model to the TF server. Only a few days later did they realize the pre-processing code was not the exact same version, and the entire prediction pipeline was just producing garbage output. Once they discovered Bento, they really saw the benefit of having a standard for teams to describe their serving logic in one place and deploy it as one unit.
On the other hand, all the optimization and performance-related improvements within BentoML's internals really helped them get better performance out of their serving system. So it's a big win for their team, and their data scientists are even happier, because they are able to get their models deployed to production much faster. For this particular customer, before switching to BentoML, getting one model deployed could take up to two months, and they needed dedicated engineering help to build and deploy that end-to-end solution. After switching to BentoML, any new use case within the team can get a new model deployed within a day or two.
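The failure in this story, a new model uploaded next to stale pre-processing code, can be illustrated with a minimal Python sketch. This is not BentoML's API: `ServingBundle`, the `scale_*` functions, and the toy `model_*` functions are hypothetical names invented for illustration. The only point is that pre-processing and model are versioned and deployed together as one unit.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ServingBundle:
    """Hypothetical unit of deployment: pre-processing and model ship together."""
    name: str
    version: str
    preprocess: Callable[[List[float]], List[float]]
    model: Callable[[List[float]], float]

    def predict(self, raw: List[float]) -> float:
        # The model only ever sees features produced by its own pre-processing.
        return self.model(self.preprocess(raw))


def scale_v1(raw: List[float]) -> List[float]:
    return [x / 100.0 for x in raw]


def scale_v2(raw: List[float]) -> List[float]:
    # The data scientist changed the feature scaling while iterating.
    return [x / 10.0 for x in raw]


def model_v1(features: List[float]) -> float:
    return sum(features)


def model_v2(features: List[float]) -> float:
    # Toy stand-in for a model retrained against scale_v2 features.
    return 2.0 * sum(features)


bundle_v1 = ServingBundle("demo", "v1", scale_v1, model_v1)
bundle_v2 = ServingBundle("demo", "v2", scale_v2, model_v2)

raw_input = [10.0, 20.0]

# What went wrong in the story: new model, old pre-processing pipeline.
mismatched = model_v2(scale_v1(raw_input))

# Deploying the bundle as one unit makes that combination impossible.
correct = bundle_v2.predict(raw_input)
```

Here `mismatched` and `correct` differ by a factor of ten even though nothing crashes, which is why a mismatch like this can go unnoticed for days.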
Simba: That's super interesting. It's interesting to see how many problems tend to fall into metadata management across serving, models, features, and a lot of the other components of the machine learning pipeline. Taking a step away from serving, what are you most excited about in general in the MLOps space?
Chaoyu: The MLOps space is growing so quickly these days, and we keep seeing new tools and new architectures coming up all the time. One trend that I tend to notice is that more and more teams really benefit from deployment tools like BentoML and Yatai, and they are capable of getting new models deployed very quickly. One thing that I'm extremely excited about is what comes next. As teams become capable of getting their models deployed and into production more quickly, people will be able to iterate on their development workflow a lot faster. What comes next for data scientists is more about understanding how their models are performing in production and when they might need to retrain. But as more and more models get deployed automatically and continuously through the pipeline, I think there's less and less need for a tool that just does drift detection or monitoring. There will be more focus on understanding how models are performing online, doing experimentation online, and continuously retraining the models.
Simba: I guess one way to think of that is as the movement on from collecting data. It's almost funny, because it's the same analytics-to-machine-learning flow that we talked about, where step one is collect your data, step two is provide analysis of the data and catch when things are broken, and step three is actually proactively use the data. So it's funny that that same flow is why MLOps exists. And at this point, most MLOps companies don't do any machine learning. We don't have anyone in-house whose entire focus is building models, beyond just getting our product to be better. Featureform itself is an example, and I think Bento too: we don't actually do machine learning. We help teams that do machine learning. We solve lots of the engineering, data, and DevOps problems that come along.
It will be interesting to see how the tools themselves start to incorporate machine learning, with monitoring seeming like one of the most obvious places to start: things like when to retrain, and experimentation. One thing that we built at my last company, and I don't know if there's a vendor that does this, was an A/B testing tool for models. Whenever we'd create a new model and we felt it was good based on certain metrics we had and how it did in training, it would automatically go into a canary deployment. In that canary deployment, something like 1% of users would be interfacing with that model. In general, we tried to keep a variety of models in production. It was a recommender system, so for context, recommender systems are interesting because there's no such thing as the best recommender system. They all have pros and cons, because the goal is to provide some sort of serendipity, which is almost impossible to measure.
But yeah, it's interesting because in that situation we were doing machine learning on MLOps. We were using the data that we were generating to automatically make smart decisions. I think it'll be interesting to see how that level of ability makes its way into MLOps tools. I think part of us not being there yet is probably just stage dependent. In terms of a pyramid of needs, before you start experimenting with different models, you probably need to deploy your model. And before you start using data to experiment with the models, you need to actually collect monitoring data on the models and see how they're performing, how the features are changing, how drift is happening, and everything that comes with that. Then, once you have that down to a science, you start to automate what your data scientists are doing today, and that almost builds a full flow. What do you think of that? Have you seen stuff like that before?
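The canary rollout described above, where roughly 1% of users are served by a new model, can be sketched in a few lines of Python. This is an illustrative assumption about how such routing might work, not a description of the system Simba's team built; `bucket`, `choose_model`, and the 10,000-bucket split are hypothetical choices.

```python
import hashlib

NUM_BUCKETS = 10_000  # hypothetical granularity: one bucket is 0.01% of users


def bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in [0, NUM_BUCKETS)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_BUCKETS


def choose_model(user_id: str, canary_fraction: float = 0.01) -> str:
    """Send a fixed fraction of users to the canary model, the rest to stable."""
    threshold = round(canary_fraction * NUM_BUCKETS)
    return "canary" if bucket(user_id) < threshold else "stable"


# Simulate traffic from 100,000 distinct users and measure the canary share.
assignments = [choose_model(f"user-{i}") for i in range(100_000)]
canary_share = assignments.count("canary") / len(assignments)
```

Hashing the user ID, rather than drawing a random number per request, keeps each user on the same model across requests, so their experience stays consistent and per-user metrics can be compared between the two groups.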
Chaoyu: Definitely, I've seen some of the more advanced teams building something like that. Thinking about a team that is just starting to adopt machine learning today, there are so many nice building blocks that they can quickly adopt and use to create that solution. And I think maybe a year from now, more of the tools in this space will have become mature and widely adopted. There's definitely a possibility that more teams can easily get to that state of continuously improving their models and finding more interesting ways to discover better models in production.
Simba: Yeah, it'll be really interesting to see how the ecosystem also starts to play together, because to do lots of this continuous improvement, you need to start tying tools together. You either need to take over the whole stack, in which case you have a full MLOps platform that can do everything and is also super, super fine-tuned. I don't think that really exists today, and it probably won't exist for a little while. So the next best thing is best-in-class vendors that work together, beyond just making it easy to connect to each other, which is step one and is already super hard. With most vendors, it's still a little bit awkward to get all of them to work together, because there are no standards yet in MLOps. The ones that exist are very basic and very ad hoc. We're just still in such early days, but it'll be interesting to watch as these teams integrate. I know we're doing some stuff together with Featureform and Bento, you know, we're working well all the teams that…You guys are working well, and our teams are trying to tie the ecosystem together more closely. But what gets me really excited about this is the second-order value. Take feature store and serving as an example, or serving and monitoring: having them work together easily is step one. Step two is that by connecting these two pieces, one plus one equals three, and I think we're just barely scratching the surface of what's possible there. I think that's where you'll start to get MLOps stacks. Like, we have a Featureform, Bento, and monitoring solution stack, and that provides these values. So it'll be interesting to see that play out.
Chaoyu: Definitely excited to see how that plays out. There are so many exciting tools in the space, and Featureform and BentoML are both growing quickly and becoming more and more mature. It's definitely an interesting MLOps stack for people to adopt, and I'm excited to see what the community is going to build on top of that stack.
Simba: This has been a really interesting conversation. It's been cool to talk about MLOps broadly. I'm always interested to see how the different categories are changing and evolving over time, serving obviously being one of them. Maybe one interesting way we could end: could you leave a tweet-length takeaway? Let's say someone listening is, you know, early in an MLOps function or getting to the next stage of MLOps, and they're looking for the TLDR. What would that be for this podcast, in your opinion?
Chaoyu: The biggest takeaway for me is that an MLOps function should focus on data science agility and on empowering data scientists to iterate faster and ship better models. The MLOps space is changing quickly, and there are so many tools and components in the open source space that work nicely together and provide more building blocks for teams to achieve that MLOps goal. I'm really excited to see where that goes, and definitely excited to see deeper integration between the two communities, Featureform and BentoML.
Simba: Chaoyu, it's been so great to chat with you today. Thanks again for hopping on.
Chaoyu: Thanks again for having me.