This week on the MLOps Weekly Podcast, Simba chats with Jacopo Tagliabue, author of "You Do Not Need a Bigger Boat" and Director of AI at Coveo, to discuss fundamentals when choosing an MLOps toolchain, the problem with end-to-end platforms, and solving technology problems vs workflow problems.
[00:00:03.320] - Simba Khadder
Hey, everyone, my name is Simba Khadder, and you're listening to the MLOps Weekly Podcast. Today, I'm really excited to be speaking with Jacopo Tagliabue. He's best known in the MLOps community for his GitHub repository, You Don't Need a Bigger Boat, and a lot of his content about ML at reasonable scale.
[00:00:24.380] - Simba Khadder
Most of his career was as a data scientist across a variety of different organisations before he started his company Tooso, which was Search-as-a-Service. He acted as CTO and took it all the way to acquisition by Coveo, where he was most recently the Director of AI. As of now, sounds like he's onto his next thing, which I'm sure he'll be sharing more about soon. Jacopo, so great to have you on the show today.
[00:00:46.390] - Jacopo Tagliabue
Thanks so much for having me, and thanks to everybody for joining us.
[00:00:49.550] - Simba Khadder
I'd love to get it from your perspective and your words. What got you into ML, what was your journey like to get here?
[00:00:55.310] - Jacopo Tagliabue
The journey to modern MLOps was basically the realization that I sucked at the entire toolchain by myself in the old MLOps world, when there was not even MLOps as a word. If you remember my story, I used to be the CTO and founder of a startup in Silicon Valley.
[00:01:11.250] - Jacopo Tagliabue
It was doing NLP for e-commerce, which is an interesting case for ML. It's high value, both in terms of business and financial outcome. But it's also high volume, and it's something that is very crucial for an e-commerce shop. You can't really run an e-commerce shop without recommendation or without intelligent search.
[00:01:28.440] - Jacopo Tagliabue
Things need to be up all the time and the model needs to produce the outcome that you expect it to. All the problems that you see today in monitoring, reproducibility, and so on and so forth were, of course, part of what we had to tackle in 2017, when we had to build our entire infrastructure without any of the tools that exist today.
[00:01:47.700] - Jacopo Tagliabue
That's why I'm saying I sucked, at some point in time, at every one of the parts that make up a modern data and ML pipeline: ingestion, transformation, training, serving, monitoring.
[00:01:58.940] - Jacopo Tagliabue
Last year, I think, last year or something like that, I went back to all the choices that we made building Tooso, during my tenure at Coveo, and we asked, "What if we built something like that now with all these tools? What would we keep?" And guess what, it's nothing. We would build basically a completely different toolchain, a completely different set of tools and capabilities.
[00:02:18.390] - Jacopo Tagliabue
Then we open-sourced that in the famous You Don't Need a Bigger Boat repo for everybody to use, which is the idea of this blueprint of an ideal AI company that basically produces a backend for models. And the realization was that what took people and months five years ago can now take literally one person and days to achieve the same level of capabilities.
[00:02:40.580] - Jacopo Tagliabue
So I can't be more thrilled to live in this day and age, when I don't have to suck anymore at all of the things that I used to suck at before.
[00:02:46.770] - Simba Khadder
It's so funny to look at... I actually have a very similar story to you. I actually built Featureform, and at my last company, we were also doing recommendations as a service, and we were handling like 100 million MAU.
[00:02:58.690] - Simba Khadder
We had something called our data platform that we built, but nowadays it's called a feature store. And MLOps wasn't MLOps. We didn't really have a word for it, it was just our workflow, I guess. We built everything. There was nothing out there. There weren't really even examples. At least with DevOps, you could look at Google and be like, "Oh, well, that's the gold standard." I never really felt that we had that.
[00:03:19.350] - Simba Khadder
That repo you mentioned, You Don't Need a Bigger Boat, obviously, if you're listening and you haven't seen it, you should go see it, it's awesome. For those who haven't seen it, could you dive into it? What does that stack look like? What did you figure out looking back and doing that analysis?
[00:03:33.150] - Jacopo Tagliabue
Sure. I think there's a bunch of things that... If you're trying to do an MLOps stack nowadays, there is a bunch of things that are somehow fundamental. As in, without those things you can't even say that you have an ML pipeline.
[00:03:46.590] - Jacopo Tagliabue
The first thing for us, and I think for everybody, is data, because data is at the end of the day the most important part and where the biggest margin of value is. How you store data, how you query data, how you normalize data, that's a very important part.
[00:03:57.810] - Jacopo Tagliabue
The second part is training. Of course, once you have data, you have to train a model to do whatever you're doing. If you're not training, you're not doing ML. That doesn't mean there's no value, but you're just doing maybe deterministic stuff, or rule-based, and so on, with a different set of problems.
[00:04:09.840] - Jacopo Tagliabue
Then finally, serving. It may mean different things to different people, depending on whether you're doing offline inference, online inference, and so on. But at the end of the day, if you train a model and the model stays on your laptop or even on your cloud, it doesn't make any impact on reality.
[00:04:22.710] - Jacopo Tagliabue
There must be a way, after training, for your model to make predictions, again in different ways. So you have to figure out at least three basic minimal pieces to call yourself an ML engineer. You have to have at least these three pieces to say that you have a pipeline.
[00:04:35.780] - Jacopo Tagliabue
So for us, going back to our previous experience, we replaced the entire EMR, Spark, S3 querying base that we had at Tooso with Snowflake. Snowflake wiped out the entire infrastructure and clunky things that we had to maintain back in the Spark days.
[00:04:53.560] - Jacopo Tagliabue
Snowflake just gives you this nice thing where you just store all the data you want and you run a SQL query. You don't really need to know how that is distributed or how that is executed. You just get the result back and it's awesome. That was point number one.
[00:05:05.600] - Jacopo Tagliabue
Point number two, for training, was moving from a clunky script-based approach to Metaflow. Metaflow is an open-source tool. Everybody can use it, it's completely free of charge. It was open-sourced by Netflix and it's now maintained by a company called Outerbounds. And for us, it was a way to enforce good practices in how you structure your training code and to improve the developer experience.
[00:05:27.930] - Jacopo Tagliabue
Metaflow, by being so pleasant to use, invites you to do the right thing and discourages you from doing the wrong thing. It automatically makes your team work a bit better in how you train, how you experiment, and how you run all these fast iterations. That's point number two.
[00:05:42.970] - Jacopo Tagliabue
Point number three, which is the one where I feel, for us, there's still more work to do, is the serving side, and that really depends on the use case. Some use cases are relatively simple in the sense that you can do batch prediction. So you don't really need to have the model that you trained live in some sense.
[00:05:59.660] - Jacopo Tagliabue
You can just run prediction against the model while you're still in the batch mode, and then you can store them in DynamoDB and serve them through the cache in real time. It's very easy, but it's very powerful because now thanks to DynamoDB, Serverless and so on, you can do that without maintenance, so that's awesome.
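The batch-serving pattern described here can be sketched in a few lines. This is a minimal illustration, not the code from the repo: a plain dictionary stands in for a key-value store such as DynamoDB, and `toy_model`, `run_batch_job`, and `serve` are hypothetical names introduced only for this example.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical stand-in for a key-value store such as DynamoDB:
# the key is a user id, the value is a precomputed list of product ids.
PredictionStore = Dict[str, List[str]]


def toy_model(user_id: str) -> List[str]:
    """Placeholder for a trained recommender; returns fake product ids."""
    return [f"product-{(len(user_id) + i) % 5}" for i in range(3)]


def run_batch_job(
    user_ids: Iterable[str],
    model: Callable[[str], List[str]],
    store: PredictionStore,
) -> None:
    """Offline step: score every known user once and persist the results."""
    for user_id in user_ids:
        store[user_id] = model(user_id)


def serve(user_id: str, store: PredictionStore, fallback: List[str]) -> List[str]:
    """Online step: a plain key lookup, no model in the request path."""
    return store.get(user_id, fallback)


if __name__ == "__main__":
    store: PredictionStore = {}
    run_batch_job(["alice", "bob"], toy_model, store)
    print(serve("alice", store, fallback=["bestseller"]))
    print(serve("new-visitor", store, fallback=["bestseller"]))
```

The design point is that the model only runs inside the batch job; the request path is a lookup with a fallback for users the job has never seen.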
[00:06:14.770] - Jacopo Tagliabue
Then some other models are actually online models, meaning that there's a prediction that is made at runtime. And for that, there's a bunch of solutions out there that we explored, from the easiest one, say on the SageMaker side, to the more complex ones. Now we're working with a company called Exafunction to do more complex stuff, and with other things in between, like Triton [inaudible 00:06:34] and so on and so forth. So that would be the third piece.
[00:06:37.260] - Jacopo Tagliabue
Once you figure out how all three pieces go together, you can add all sorts of fancy stuff that you want. And if you go to the big [inaudible 00:06:43] repo, you're going to see different options for orchestrating this. You're going to see different options for experiment tracking. You're going to see monitoring solutions and so on and so forth. But all of that comes into place once you have the first end-to-end version, from raw data to prediction, figured out. Because if you try to do more complex things before you figure out the first end-to-end thing, you're just going to make your life more complex without the necessary upside.
[00:07:06.760] - Simba Khadder
I want to jump into the explosion of MLOps tooling. One thing you've said, which I totally agree with, is that the goal, especially if you're a startup, is simplicity. You don't want all these moving pieces because it's your job to maintain them. So you want to go with things that are hard to use wrong, that just work. A lot of the things you mentioned, like Snowflake, I wouldn't really call MLOps companies. Now there's been an explosion of MLOps tooling. My first question about it is, it happened about three years ago. At least, that's when this new wave of MLOps startups came to be. You had that problem years before that, and so we had to do our own stuff. What do you think happened? Why did it start three years ago?
[00:07:48.570] - Jacopo Tagliabue
There's a bunch of things. First, I think the base of practitioners that run ML became much broader and went beyond the few companies that already know what they're doing. I have friends at fancy companies like Facebook, Google and so on and so forth, and nobody ever complained that they cannot put a model in production. None of my friends ever said any such sentence. There's a lot of other problems, but not this one.
[00:08:13.690] - Jacopo Tagliabue
And then if you go and look at the usual Gartner report that everybody cites, or most of my LinkedIn feed, it's about people complaining that 80% of projects don't make it into production or something like that. So there must be something that these people figured out that everybody else didn't, by definition.
[00:08:28.630] - Jacopo Tagliabue
I think the market got bigger because there are a lot of companies that are not necessarily at Facebook scale, like Coveo, or Tooso, my company, but are sophisticated enough to now need to run ML for the sake of their business. They don't have the resources or scale of big tech, and so there's more appetite for tools that automate some of the things that are not my core business and that I don't want to maintain, but that are still necessary for me to do my job. The more AI companies there are out there, the more practitioners need to focus on creating value with AI, not so much on maintaining an AI stack. And I think that's a huge shift in demand in that [inaudible 00:09:03] sense.
[00:09:04.420] - Jacopo Tagliabue
On the supply side, we live, especially in the Valley, through cycles of excitement and disillusion in many things. And I think there's been a lot of excitement in the VC market for ML and data companies. In the last couple of years, we've seen very good rounds, very good valuations and so on and so forth. So obviously, the field is very optimistic about the fact that this market is going to get bigger and bigger. And so on the supply side, too, there has been a disproportionate amount of money at the disposal of people that wanted to build companies and solve this problem. So I guess these two things together conspired to create this perfect wave where now there's like a million tools for ML, MLOps, DataOps and so on and so forth. That makes it a very interesting time to be in the space.
[00:09:45.980] - Simba Khadder
How do you think that will evolve? Maybe one specific question: there's a lot of companies that aim to be best-in-class category solutions. There are companies that aim to be full-stack MLOps. There are a lot of people in the middle. It feels like there are some established categories, like feature stores. There are a lot of feature store companies and monitoring companies. And then there are new companies that almost evade categories. You can't even really lump them into something, they're just different takes. What do you think the future looks like? Do you think that there will be consolidation into MLOps platforms? Do you think it will be a lot of best-in-class tools?
[00:10:22.060] - Jacopo Tagliabue
That's a very good question. And the answer is, of course, I don't know. Prediction is very hard, especially about the future, like the guy used to say. But the general guess is, it seems there are a lot of companies right now and, especially in some sectors, a lot of competition and not a lot of differentiation. So there are entire sectors in which it's very hard. There are a lot of good tools, but now they're all very similar to each other. So it's very hard for somebody to come in with a fresh perspective.
[00:10:49.460] - Jacopo Tagliabue
There are people that are trying to inject that fresh perspective, as you said, by expanding into more parts of the MLOps toolchain. Maybe they started with one specific problem, but now they're trying to lure you in on other sides. But that's a risk, that's a tradeoff you're making. As in, if you're trying to do more than one thing, you may not be best-in-class in all of these things, of course. And then there's an argument to be made that a sophisticated practitioner will just pick and choose the best tools and orchestrate them themselves. Which was the lesson of the Bigger Boat repo. Which, again, is already a year old. Let's remember this. It's already a year old.
[00:11:21.710] - Jacopo Tagliabue
But at that moment in time, I felt that the best practitioner will pick and choose from SaaS and open source and put them all together. Which doesn't mean it's going to be the same answer in two years from now.
[00:11:32.180] - Jacopo Tagliabue
I would be very skeptical of entirely end-to-end platforms, because I think ML workloads superficially look similar, but at the end of the day the devil's in the details. So there's a lot of difference between different companies and how they implement things. And sometimes an end-to-end platform feels like constraining people to work in one way instead of actually being a platform where people can do whatever they want.
[00:11:53.000] - Jacopo Tagliabue
I'm not a huge fan of the "Hey, this is a tool, you have to do everything in it, from training, testing, deploying, and whatever else." I'm not a huge fan of that as a person. But also, I understand I'm more on the sophisticated end of the spectrum of practitioners. I also have a very opinionated view of how things work and a lot of experience in how to combine different things. That would be my take.
[00:12:12.050] - Jacopo Tagliabue
Some consolidation, of course, but I'm not sure there's going to be the platform that ends it all. I'm not sure there's an equivalent of Snowflake, to my point, for the entire MLOps thing. Because while all SQL queries look the same to a certain extent, a lot of MLOps companies, when you go under the hood, actually look very different.
[00:12:27.920] - Simba Khadder
One comment I got from one of our advisors was, as a tech company, as a dev tool company, you sell one of two things. You either sell technology or you sell workflow. If you're selling a technology, it's just like, "Yeah, I can do this thing that everyone else does, but better," like Snowflake. It's SQL, but it's better than all other SQL databases because of all these things.
[00:12:47.490] - Simba Khadder
Then there's a workflow, which is, hey, what we built, it's not necessarily technically hard, it's just building the right API and choosing the right API. I use Terraform as an example here. Terraform is not easy to build. It's a hard thing to build. But in terms of what they did that I think was the hardest was the right API and the right interface to solve the problem at hand. And I think most MLOps companies are more the latter. They're workflow problem solvers and they're not technology solvers.
[00:13:17.030] - Jacopo Tagliabue
I totally agree with that. I think it's a fantastic example, with HashiCorp's success. I think that's a really good point. The analogy somehow breaks down, though, as in infrastructure is a very broad thing. The market is gigantic. Everybody has infrastructure. And again, a lot of infrastructure pieces tend to be the same across different companies, even if they do many different things. ML, unfortunately, is not the same. Even with two people doing classification with deep learning, to be even more specific, these two things may actually look completely different.
[00:13:44.360] - Jacopo Tagliabue
Because the truth of the matter, which few people, I think, understand outside of ML, is that there's no such thing as the ML that solves all the problems. It's ML that works on a very specific thing, tuned to a very specific problem in a very specific way. That's the only ML that realistically works. Which means that you need to go deeper into the type of problem. There's a coupling between your ML and your problem, way more than there is between your infrastructure and your problem, which makes it hard to build this all-encompassing platform at the scale that HashiCorp reached, in some sense.
[00:14:13.550] - Jacopo Tagliabue
Of course, everybody bets that this market is going to grow. I'm one to bet on that for sure. I have a vested interest as a practitioner in the market getting bigger because, again, remember, all of this started because I suck at things. So the more people build stuff for me, the better off I am. But I think there's a distinction between ML companies generally and infra, dev, and data companies that somehow gets underestimated, even by VCs or clients or casual observers, or people that never built a model themselves, basically.
[00:14:41.250] - Simba Khadder
I totally agree with that. I come from a recommender system background. And the way we did things and the problems that we had are completely different from those of people with computer vision backgrounds and people who do like [inaudible 00:14:51]. There's also a difference between teams: we work with big enterprise teams and we work with very small teams, and the way they think, the way they problem solve, what they care about is just completely different. I think that those distinctions are often ignored. This computer vision versus not computer vision is a great example. The way you do computer vision, the problems to be solved, they're just so different.
[00:15:16.170] - Simba Khadder
In modern computer vision, feature engineering isn't as much of a thing as it used to be. Whereas if you look at tabular data or recommender systems, it's all feature engineering. That's where all our value came from. I think that's something that's really, really crucial. When you think of your repo, do you create that distinction or do you think about the distinction, or do you think that the blueprint you have works across computer vision, recommender systems, et cetera?
[00:15:39.270] - Jacopo Tagliabue
The point about the repo is like, hey, this works for us and this is a use case that you can use. There are two use cases there. There's a bunch of other new open-source repos, for example, one just for recommendation. There's a bunch of other repos that are tackling different use cases. They are variations on the original blueprint. Probably they're better for the iteration.
[00:15:56.890] - Jacopo Tagliabue
But the idea is that the original one ships with two problems. One is a classification problem. I give you a sequence of events, you need to tell me if it's going to be a conversion event or not. Imagine this as a shopping example. So people are browsing on your website. You need to guess, is this person going to buy at the end or not? And that's one problem. And the other problem is actually sequential recommendation. So the user is interacting with products, and it's like YouTube's "up next" or something like that.
[00:16:22.380] - Jacopo Tagliabue
The thing that we say in the repo is, this is not likely going to be the end of your MLOps story. We're not pretending to end it all or to give you your company on a silver plate. But this is probably a good place to start. If you have a problem that resembles these two problems or is actually one of these two problems, the pieces that we put in place for you to start are going to be okay. If you start with this, they're tested, they're okay, you can try them on, they're all open source. We actually open-sourced all the data.
[00:16:48.950] - Jacopo Tagliabue
So if you want to run the exact project that we did, we open-sourced millions of shopper events that you can actually use to get a feeling for the scale, for a different level of data load, for example. And once you're happy with that, you can make all the changes you want. We think of the repo as the place to start, not the place to end your journey. But at least you start, instead of going on Towards Data Science for two months and reading all the articles about MLOps. Start with these five tools that work well together, and we show you how they're combined. And then if you're unhappy with one of them, you can always swap it in and out after you understand the logic.
[00:17:20.000] - Jacopo Tagliabue
Because again, the important thing is understanding the fundamental functional pieces and how they relate together. And then you can make all the personal choices you want about tool A or tool B, once you understand how the flow works.
[00:17:30.960] - Simba Khadder
Is there something you built at Tooso that there isn't a good solution for yet, that you'd have to rebuild? Would all of it be replaced by modern infrastructure?
[00:17:38.900] - Jacopo Tagliabue
Most of the things that Tooso was building, and Coveo as well, in the sense of infra and development, I think are like 95% solved by tools that you have now. You need to use them properly, again. That's the other mistake we need to avoid. So one mistake is thinking that you have to build everything yourself. That's false. But the other mistake is thinking, well, now you have tools and tools are going to do your job. That's false. Tools are going to make you more productive. Tools are going to make you achieve more for less input. But you still need to know how to use the tools.
[00:18:04.130] - Jacopo Tagliabue
Snowflake is an amazing tool, but if you don't have good practices on how you store the data and how you replay the data, for example, you're going to lose a lot of the properties that Snowflake gives you. Metaflow is awesome, but if you don't configure it properly to work with your cloud, it's not going to give you the speed-up that you need. So there's still a lot of things that you need to do. That's the 5%, you know what I mean? That's the 5% that is left for us to do. But at least you don't have to build or maintain the underlying infra layer, which is not what the company is about.
[00:18:31.550] - Jacopo Tagliabue
My client doesn't care about the fact that I use AWS Batch or Snowflake or whatever I'm using. They care about the quality of my recommendations. And so I want my team to focus on the quality of my recommendations, not on the underlying compute, for example, or storage. That's a key point here. But I still have to do my part, yeah.
[00:18:49.690] - Simba Khadder
I'm curious. When I talk to data science teams, most of the time they're using pandas or something data frame related, or using notebooks, and a lot of the work is experimentation. And they treat deployment as a separate issue: cool, I experiment, I do all that stuff, and then finally I switch my hat into deploy mode and I go into deployment. And it seems like, I guess, the workflow you've mentioned, or at least the tools you mentioned, like Metaflow and Snowflake and all of that tooling, are more oriented towards deployment and less towards experimentation. One, do you buy that? Do you feel like there's this distinction?
[00:19:25.120] - Jacopo Tagliabue
My friend Ville, who is one of the creators of Metaflow, said one of the smartest things that anybody's ever said about MLOps, which is: production-ready is a continuum. Of course, that comes from his life at Netflix, which is a very advanced company that understands this very well and makes people productive in that sense. But you need to understand this to be a good ML practitioner.
[00:19:44.740] - Jacopo Tagliabue
Production-ready is a continuum. There's no such thing as experimental versus production-ready. There's something that works and is okay to run on 5% of my traffic, or even on just 1% of my traffic for a day, where you manually collect the results and then iterate from there. And then there's something that is like, well, let's run on 30% of our users for an entire week and see how that goes. And then there's the full-fledged thing of, hey, now let's serve all of Netflix.
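One common way to implement this kind of gradual exposure (1%, then 5%, then 30%, then everyone) is deterministic hash bucketing. The sketch below is an illustration of that general technique, not a description of how Netflix or Coveo actually route traffic; `bucket` and `use_new_model` are hypothetical helper names.

```python
import hashlib


def bucket(user_id: str, buckets: int = 100) -> int:
    """Map a user id to a stable bucket in [0, buckets).

    A cryptographic hash is used so the assignment is the same across
    processes and machines (Python's built-in hash() is salted per run).
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def use_new_model(user_id: str, rollout_percent: int) -> bool:
    """Return True if this user should be served by the candidate model."""
    return bucket(user_id) < rollout_percent


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(10_000)]
    for percent in (1, 5, 30, 100):
        exposed = sum(use_new_model(u, percent) for u in users)
        print(f"{percent:>3}% rollout -> {exposed} of {len(users)} users")
```

Because a user's bucket never changes, raising the percentage only adds users: anyone who saw the candidate model at 5% still sees it at 30%, which keeps the experience consistent while exposure grows.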
[00:20:11.460] - Jacopo Tagliabue
The good tools are the ones that scale automatically across this, and Metaflow is a very good example. Metaflow is good for local experimentation, and with one click you can deploy on the cloud. Now it runs every day if you want, and you can run it at any scale you want, up to a certain limit.
[00:20:26.290] - Jacopo Tagliabue
So it's the same tool that stays with you from the notebook phase to an endpoint running in production. But the crucial aspect here is you have to train your team to think about the constant iteration, not about two separate phases. Because at the end of the day, there's no judge of an ML model like production. People need to test in production.
[00:20:45.920] - Jacopo Tagliabue
There's a meme where there's the normal distribution, and there's people to the left and people to the right of the normal distribution. So the people who are not super smart and the people who are super smart. The people to the left and the people to the right are going to say, "Test in production", and the people in the middle are going to be the normal people saying, "No, you have to test on an offline set." That's all well and good. But at the end of the day, real people test in production. That's a super important thing culturally that needs to happen in your company. And the shorter the time from your idea to production, with all the guidelines and safeguards that are needed, of course, the better ML company you will be.
[00:21:20.670] - Jacopo Tagliabue
That's a secret of good ML teams. Good ML teams can go end-to-end from their laptop to production without asking anybody. Without asking DevOps, without asking lawyers, without asking security. And that's why they're a good ML team, because in the time when everybody else is waiting for a DevOps person to deploy infrastructure, they're going to have iterated three times on their model. So that, for me, is key. I don't buy the distinction between laptop and cloud. It's one thing for me. It's just a matter of degree in the amount of guidelines that you need to put in place.
[00:21:48.030] - Simba Khadder
I 100% agree. It's something that we think a lot about with our product. I think it's interesting because I actually haven't had anyone say that before. It's something that I've come to believe: these phases need to be combined. A lot of people are like, yeah, notebooks are great for experimentation but they're not good for deployment. And that's true. Or I think it's true. But it's not a black-and-white thing. There needs to be a way to naturally transform what you're doing in the notebook into deployment. Like in our case, in the notebook, you can push features and training sets.
[00:22:17.070] - Simba Khadder
You make me think of interesting points. I love the example of pushing to production. It's something we actually built too. I'm not surprised Netflix does. Netflix is obviously known for things like Chaos Monkey, which comes from the same, I guess, philosophy of testing in production. We used to do the same thing with our recommender system. Every time we'd have a model that we thought would work, we would hit a button and it would immediately go to 1% of the audience, and it would grow and shrink according to how it did. We'd always make sure we had at least two, because we were afraid to have one and accidentally pigeonhole our recommendations too far.
[00:22:48.290] - Simba Khadder
So we had all these things we built. It's not something I've ever seen mentioned. Sometimes I'm like, "I can't believe we're the only people who built this." But I've never seen a lot of talk about this thing, and it was a key piece of our infrastructure.
[00:22:58.600] - Simba Khadder
The last thing I'll add: someone told me once that the greatest adversarial network you will ever find is the Internet. If you put a recommender system out, you can bet that there's a whole very smart, very sophisticated, decentralized team of humans trying to break it, like with Google search. There's always a billion people in the world trying to game the system. So you have to deal with the smartest adversarial network you can create. When you talk about the degrees of ready for prod, is that what you guys do today? Do you try things out on different users? Does it apply outside of recommender systems? How do you think about that? I would love for you to expand on it.
[00:23:37.410] - Jacopo Tagliabue
You have the general principles, and then of course there's what your company does. The crucial difference between B2C companies like Netflix and B2B companies like [inaudible 00:23:45] or [inaudible 00:23:46] or whatever is that you serve different customers that are shops, but you don't control anything that happens between that shop and the final users, the shoppers. So what you build as a recommender system is going to be used by users. But those users are not your users. They are your customer's users. It's a subtle distinction.
[00:24:03.720] - Jacopo Tagliabue
So it means that there's a limit to the things that you can do. First, you don't control the final UI, because of course you just provide an API as a SaaS platform and then the company does whatever it wants with it. That's why they pay you. And the second thing is that there's a limit, both contractual and also more philosophical, like an ethical limit, to the number of experiments you can run when there is a customer between you and the final shoppers.
[00:24:26.180] - Jacopo Tagliabue
At some point, you need to take that idea that production-ready is a continuum (Ville, thank you very much for the tagline) and then make it work for your own company.
[00:24:34.140] - Jacopo Tagliabue
But I think, especially in the last couple of years, the MLOps team that I worked with at Coveo made an incredible, very significant step forward toward realizing that vision. Part of that was culture, of course: being open to this new way of doing stuff. And part of that was MLOps tools that make this something that can happen in months and not in years. Even in large companies, like a thousand-person public company, thanks to the Metaflows of this world, you can think of making this culture shift at a relatively low price, so that people are also more encouraged to see a new way of doing things. You know what I mean?
[00:25:08.250] - Jacopo Tagliabue
If you're going to say to people, "Hey, this is a fantastic new way of doing stuff, but now we have to climb Mount Everest to achieve this", it's going to take a lot of convincing, a lot of things to go around. But it's different if it's like, "Hey, we can just start with this open source stuff. We can start with one model; we don't need to change our entire thing. Let's start with one specific use case, port it to this new way of doing things, and then we enlarge from that." But you need people to get behind you and build it. So, again, demand and supply. On the demand side, companies are more ready to have sophisticated tools in place. And then supply. Honestly, there are very good tools now that actually make these choices even a no-brainer compared to three years ago.
[00:25:42.860] - Simba Khadder
What are you most excited about in MLOps?
[00:25:45.340] - Jacopo Tagliabue
There's a bunch of things that I think are still open questions, and I would love to see more companies tackle them. So there are two topics that I really like, and I'm excited to see what the community is going to do in the next couple of years. One is testing, which of course is something that I also have a vested interest in as the co-author of RecList and the [inaudible 00:26:04] challenge and so on and so forth.
[00:26:06.600] - Jacopo Tagliabue
I believe we really suck at testing recommender systems, but more generally at testing ML systems. And the idea that your ML system is defined by one metric, MRR or NDCG on a held-out test set, is a profoundly misleading idea. So the idea of our generalization capability, the idea that we can predict what the model is going to do in the wild with this number, is really not doing a good service to the field. Of course, again, I also contributed as an open source contributor and a scholar to this problem, and I will continue to do so with the team. But I would love for this to become one of the central discussions about ML.
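A minimal sketch of why a single aggregate number can mislead, assuming a toy held-out set where each example carries a user slice. The data, slice names, and helper functions below are hypothetical illustrations, not taken from the conversation or from any specific library; the metric shown is mean reciprocal rank (MRR).

```python
from collections import defaultdict


def reciprocal_rank(ranked_items, target):
    """Return 1/rank of the target in the ranked list, or 0.0 if it is absent."""
    for position, item in enumerate(ranked_items, start=1):
        if item == target:
            return 1.0 / position
    return 0.0


def mrr(examples):
    """Mean reciprocal rank over (ranked_items, target, slice_name) examples."""
    if not examples:
        return 0.0
    return sum(reciprocal_rank(r, t) for r, t, _ in examples) / len(examples)


def mrr_by_slice(examples):
    """Compute MRR separately for each slice instead of one global number."""
    groups = defaultdict(list)
    for example in examples:
        groups[example[2]].append(example)
    return {name: mrr(group) for name, group in groups.items()}


# Hypothetical held-out predictions: (model ranking, true item, user slice).
HELD_OUT = [
    (["a", "b", "c"], "a", "returning_user"),
    (["a", "b", "c"], "a", "returning_user"),
    (["b", "a", "c"], "a", "returning_user"),
    (["x", "y", "z"], "q", "new_user"),
]

overall = mrr(HELD_OUT)              # one number for the whole test set
per_slice = mrr_by_slice(HELD_OUT)   # the same metric, broken down by slice

print(f"overall MRR: {overall:.3f}")
for name, value in sorted(per_slice.items()):
    print(f"  {name}: {value:.3f}")
```

In this toy data the overall score looks acceptable while the `new_user` slice gets nothing right, which is the kind of behavior a single held-out metric hides.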
[00:26:44.600] - Jacopo Tagliabue
As more people do ML, the barrier to entry gets lower. More models are going to be in production, changing the world and changing user behavior. So testing is going to be more important, not less. And I feel there's not nearly enough discussion. There's still a lot of discussion about models or about GPUs, but not enough discussion on how to properly test your model. So that's topic number one, and I'd really like to see where the community is going to go.
[00:27:07.710] - Jacopo Tagliabue
And topic number two, which you mentioned, is the experimentation platform. As you said, it's incredible that people don't talk more about the fact that you have to have two models always competing with each other and so on. But the truth of the matter is, there's no open source one, there's not even a real experimentation platform out there for people doing ML. There are optimization tools for people changing the color of a button in a UI. There's tons of options for that. But there's no experimentation platform like those of the Stitch Fixes or the Netflixes of this world, designed for people sending complex models into production and understanding how these models behave.
[00:27:42.440] - Jacopo Tagliabue
So, somehow, an experimentation platform for the backend, not for the UX. I would love for somebody to build a company in that space, or at least to build a framework, at least to raise awareness of how important that is for the community. Because building one yourself, and you know it, because you did, is a lot of work. You would totally offload it to somebody else if it was possible.
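One small building block of a backend experimentation setup, where two models compete on live traffic, can be sketched as deterministic assignment of users to model variants. This is a minimal illustration under stated assumptions, not how any particular platform works: the function name, experiment name, and variant names are all hypothetical.

```python
import hashlib


def assign_variant(user_id, experiment, variants):
    """Deterministically map a user to one variant of an experiment.

    Hashing the experiment name together with the user id means the same
    user always sees the same model within an experiment, while different
    experiments split users independently of each other.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]


# Hypothetical experiment with two competing models.
VARIANTS = ["model_a", "model_b"]

assignments = {
    user_id: assign_variant(user_id, "ranker_test", VARIANTS)
    for user_id in ("u1", "u2", "u3")
}
print(assignments)
```

A real platform would also need exposure logging, metric collection, and statistical analysis on top of this; the sketch only covers the stable split of traffic between models.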
[00:27:59.830] - Simba Khadder
I wonder if those two things could be combined, actually. Testing and experimentation, it almost feels like there's something there. Anyway, I love that. I think that's very interesting. And for someone listening who's motivated to start an MLOps company and thinking about what idea to build, there's two great ones. I think we could keep talking forever. But I do want to put a bow on this. Maybe one last thing. If you had to give a tweet-length takeaway, almost a TL;DR for someone who listened to this and is like, "This is amazing, I need to go tell my team about this," what should they say?
[00:28:29.840] - Jacopo Tagliabue
My suggestion is always the same. A, it's a fantastic moment to be in machine learning because you don't have to do much. So the first point is: if you think that getting started with your own machine learning thing is super hard, well, it's not trivial, I grant you, but it's not nearly as complex as five years ago. So this is a fantastic moment for people to get started. And there's a lot of material that you can start from; yours, mine, for example, but there's tons of other fantastic material.
[00:28:54.900] - Jacopo Tagliabue
The second point, I would say the second tweet, is: start with the problem you have and the stack you have now, not the one you wish you had. I think a lot of people get overwhelmed by the idea: "Oh, wow, but this won't scale when we're Facebook." I'm like, "Maybe, but you're not Facebook now. And if you're ever going to get there, it's an epic problem to be thinking about." Think about your recommender system now, or in six months, or in a year. Don't try to overengineer this entire thing for a future that may never materialize.
[00:29:24.630] - Jacopo Tagliabue
The metaphor is always the same. I see a lot of people in this space that are trying to start playing tennis, and the only thing they consume is Roger Federer's training, which is amazing. Roger Federer's training is very inspiring, but it doesn't have any bearing on learning how to do a forehand and a backhand right now. Right now you need a forehand, a backhand, and a volley that are solid. Chances are you're never going to become Roger Federer. And even if you become a professional, that's going to be some years from now. Thanks, Roger, for all the amazing things that you did, but for me, right now, as a tennis player, I just need to hit a hundred forehands and get them on the other side. So focus on that. Then if you're a Roger Federer, we'll figure that out when the time comes.
[00:30:00.400] - Simba Khadder
To expand on that, I bet that whatever you think it takes to be him, you'd look back in five years, if you make it that far, and be like, "Wow, can you believe that we thought that was what we were going to do?" Even if you do get there, you're probably wrong. So why waste your time thinking about that? Just focus on the problem at hand. I love that. That's an awesome analogy. Jacopo, it's been so great chatting with you. Thanks for hopping on.
[00:30:23.670] - Jacopo Tagliabue
Oh, thanks so much for having me, and thanks to everybody for listening.