In this week's episode, we sit down with Leigh Marie Braswell, Principal at Founders Fund, to discuss her experience as a PM at Scale AI, her transition from an MLOps practitioner to an investor, and her thoughts on the future of MLOps.
Simba: Hi, I'm Simba Khadder, and you're listening to the MLOps Weekly podcast. Today, I'm talking with Leigh Marie Braswell, an investor at Founders Fund. Before joining Founders Fund she was an early engineer and the first product manager at Scale AI, where she originally built and later led product development for the LIDAR and 3D annotation products, which are used by many autonomous vehicle, robotics, and AR/VR companies today. She has also done software development at Blend, machine learning at Google, and quantitative trading at Jane Street. Leigh Marie, it's great to have you on the show today.
Leigh Marie: Thanks for having me. I'm excited to chat.
Simba: So we like to start with kind of a vague question. MLOps is a new space, and we're always interested to know how our speakers would define MLOps today. What does MLOps mean to you?
Leigh Marie: Yeah, there definitely isn't a consensus definition for MLOps. If I had to summarize what I think of when I hear of, quote unquote, an MLOps company or someone who needs help with MLOps, it's really all about how we take all of this research that has been done in machine learning, these cutting-edge results in computer vision or NLP, and get it into production. How do we actually make a self-driving car? How do we actually make a recommendation system that works within the constraints of the real world? And so that's not just model performance in a box. It's how does the model interact with edge cases? How does the model interact with users in a timely way? All these sorts of questions that you need to answer for ML to work in production.
So I kind of think of it broadly as: MLOps tools fall into helping with different parts of the machine learning life cycle. Whether it's data collection, training, testing, debugging, deployment, or monitoring, MLOps tools are usually focused on one of those areas, or maybe span a few of them, and they might interact with multiple user personas. The ML engineer, obviously, but then some of them are also data tools. How do you make sure that the data going into your ML model is very clean? Sometimes you're interacting with a data engineer, or maybe you're helping facilitate communication around part of the ML model life cycle, so you're working with a data analyst or with a PM or something like that. But yeah, it's kind of all about how do we really productionize all this amazing work that's been done in machine learning research and have the results make our lives better.
Simba: Yeah, and it feels like MLOps has had a few different phases. We've been putting ML in production for a long time at some companies. You've seen it from when you were at Scale, and now you're seeing it at Founders Fund, and MLOps is rising in prominence as a term. How has MLOps changed in your time from Scale to now at Founders Fund?
Leigh Marie: Well, to be fair, I've only been at Founders Fund for about nine months, so it hasn't changed way too much. I have learned quite a lot more, because now I view it through a different sort of lens as an investor versus at Scale. There, internally, I was helping manage machine learning model usage, and I was also working with lots of customers that obviously have a lot of MLOps problems outside of just labeling, which was the primary area where Scale was helping them at the time. But yeah, definitely a few things have changed from a macro perspective. There's been a ton of consolidation in autonomous vehicles, for instance. So I think people are realizing, and there are more proof points, that it's going to work. We're going to have autonomous vehicles and autonomous delivery robots; some cities already do.
So that's been an interesting aspect of the ecosystem. There have been a lot more point-solution MLOps companies that have popped up, I would say in the last year; there are just a lot of options now. If I say I want to just do model monitoring, there's a ton of startups I could pick from as somebody evaluating things. And it's great in that we're sort of democratizing access. As you said, ML has been done in production for quite a while at, for example, Google, but that's because Google can afford to hire a ton of really great ML engineers and other types of engineers and build all the infrastructure you might need internally. But now you've seen more and more companies coming out of Google or Uber or Facebook that take part of the ML infrastructure from there, maybe inspired by pain points they encountered building internal tools, and now they're bringing it out for people to use.
But what's really interesting, and you see the same pattern in dev tools and data tooling and security, is that fundamentally, ML engineers at different companies have a lot of different problems. So is something that helps with machine learning management in production at Google always going to translate to something that's going to help with machine learning management at a small startup or at a Fortune 500 company? I'd say the answer is no. And so it's interesting seeing these startups navigate their way to product-market fit with a set of very different customers, in terms of where they are in their machine learning journeys.
Simba: Got it. And one thing I want to maybe explore a little further in that: do you think the main problems to be solved have changed over the last few years of MLOps? Or is it more of kind of an education thing?
Leigh Marie: Yeah, I think part of it is just that people are doing a lot more with machine learning than they did a year or two ago. And there's just kind of more interest in using ML versus using other types of data analysis or automation. So the pain points... it's interesting, because I think we're just seeing so much evolution in the skill set and the tool set that you use as an ML engineer to do anything. I always say this: if you asked 10 or 20 engineers at distinct ML startups what they're using in their ML stack, I can almost guarantee you would get a wide variety of answers. And you would also get people that say, I've literally switched from this to this to this, because that's what we did at Scale. We set up some stuff, we coded stuff internally, we switched. I don't think there's as much standardization around that as, okay, well, I want to set up a data warehouse, so what am I going to pick? And you're probably not going to switch immediately out of that once you set it up.
So it's interesting. I think the tools with the best developer experience are going to win. And right now we're still seeing what the best sort of ML engineer experience is, and whether the people you're selling to at all these different companies are going to agree on something being the best experience, just because they're working on wildly different problems in NLP and computer vision and analyzing other types of data. Maybe they shouldn't even be using ML in the first place; that's something I see a lot. Or maybe they should, and they just don't know how. So maybe it's a software engineer at an early-stage company that just doesn't know how to get started, or a data analyst at a company where all the ML engineers are working on something else. Yeah, I think there are a lot of possibilities in the space, but figuring out the right sort of go-to-market motion is very challenging. As an investor, I can give data points of things I've seen at different companies, but I can't say, well, if I saw a company executing on the XYZ strategy, I would know that it's the right strategy. I have no idea.
Simba: We talked a bit about the different... You mentioned that startups' ML stacks look very different from, like, Google's stacks. And we talked about autonomous vehicles, and, you know, lots of new use cases just coming around FinTech and B2C, and you saw the autonomous vehicle use case and some robotics use cases very hands-on, directly at Scale. Are the MLOps problems the same, and should the stacks be different? Like, will there be verticalized MLOps stacks? How do you think about that?
Leigh Marie: The easiest part of the question to answer is: will the stacks always be different for AV and robotics? Absolutely. The amount of sensor data that you need to ingest and act upon is just wildly different. You're getting 3D data, 2D data, timestamped data in these big, big tensor logs. And it's not just data from the robot's perception of the world. In many cases it's also mapping data, like a Google Maps version of the road, that the robot needs, especially in AV, to follow traffic laws and to navigate where it is relative to its targets, like if it were delivering something. And then once you know what the world around you is and where you're going, you need to figure out how to get there, and that's a whole different part of the stack: planning. Whereas if you're an ML engineer not working on a robotics or AV problem, you don't have to worry about a lot of these concerns. And as a result, you've actually seen startups with great success doing vertical-specific infrastructure.
So for example, Scale; Scale's bread and butter was AV image labeling, and then it expanded to LIDAR labeling. And I now see more and more robotics infrastructure companies coming up that are specialized in: how do you ingest all of this data effectively, and can I create standards around the format of the data and the visualization of the data, because all robotics companies encounter these same problems? So I think that'll continue to happen in those areas. I think what's more interesting is general ML, whether it's ML operating on tabular data, predicting user churn, or forecasting; these sorts of very common problems with very basic data, whether it's text or numbers. I think that's a little bit more interesting. Potentially we will converge on: okay, if I'm an engineer with this level of sophistication and I want to solve this problem, here's how I use ML most effectively to solve it. I could totally buy that.
And maybe it's even one tool where, if I'm not as technical, the tool will do a lot more for me, and if I'm very technical, I have the ability to override defaults and go in and essentially use it like a notebook or TensorFlow or something. I think if I were looking at a company that had a lot of AV customers and they didn't at least have a vertical AV focus for a little while, I'd be very skeptical, because I think some of the best advice for starting a company, especially if the customers you're targeting are as specialized as AV customers, is to be the best in that one area where you're solving the pain points, and then expand outwards. And that's what we've seen Scale do.
Simba: Got it. So it sounds like verticalization of MLOps will seemingly be a thing from your perspective. And how do you think about... There's also almost verticalization on problems, where you have kind of best-in-class point solutions versus these ML platforms. Do you think that both will exist? Do you think there's one that's better? How do you think about the space between point solutions and ML platforms?
Leigh Marie: That's a great question. And it's funny, because you talk to any company, and nobody wants to describe themselves as a point solution. Every company's a platform, and I mean, I don't blame them. I think you realize that for your sort of grand vision, you need to inspire people around a platform, at least for a particular user persona or a particular vertical. So I'm not sure. I think eventually, in the very long term, we'll see more and more platforms and fewer point solutions. I guess I don't have a super strong opinion on when that's going to happen. And also, if you're a point solution but you're solving a hair-on-fire problem that people are willing to pay a lot of money to solve, then you're probably not going to go anywhere, unless somehow your competitors, who are also offering other things and are very entrenched within customers, figure out how to duplicate what you're offering, or maybe figure out a clever pricing strategy or a strategy to get you out of an organization.
So you definitely have a leg up if you're a point solution and it's something that's just so essential, and it almost behooves you at that point to ask, okay, well, how do I continue to grow with this company, or figure out a different, less nichey area for me to focus on? So I'm really not sure in terms of the timeline for when you will see mostly platforms and fewer point solutions, but right now there's definitely a lot of really exciting point-solution work being done. And I'm eager to see how these companies strategically expand outwards.
Simba: What excites you most about being an investor in the MLOps space now?
Leigh Marie: There's a lot of stuff that excites me. The reason I joined Scale out of college was, I mean, I think that machine learning in general is just this beautiful marriage of math and programming. And it's done in such a way that many of the applications ML leads to just couldn't be done any other way, and you can't always explain exactly what's happening. So I think that's just so cool and so interesting. And that's kind of the reason I wanted to join Scale and then spend four years there working with many companies that are doing deep learning in production. And so now, as an investor, I get to focus part of my day on mapping out all the companies, making platforms or point solutions or whatever, that either help ML engineers do their jobs or are ML applications themselves; so better ways of doing things, more efficiently or with higher accuracy. And it's such a rapidly changing space as well, as we've kind of talked about. There's just so much going on, and so many standards are still being built. So yeah, it's been great. I am excited to continue learning more.
Simba: So you said that the space... well, we can both definitely agree that the space is changing very rapidly.
Leigh Marie: Yeah.
Simba: And because of that, it feels like there's a lot of confusion, there's also a lot of misconceptions. What do you think is something that many people get wrong about MLOps?
Leigh Marie: That is a good question. Well, it depends on who you ask, I guess. If you're not a practitioner, if you're not a person doing ML day to day, coming in, maybe you want to map it to an area that is a bit more figured out, like data infrastructure. But if you're looking at ML infrastructure in particular, it's just not nearly as mature as its data infrastructure analog or its dev tools analog. People don't have consensus opinions on a lot of really basic stuff. And so a lot more than you would think is done internally, even when maybe sometimes it shouldn't be. I've seen companies make their own experiment tracking system and collaboration system, even though they knew about Weights & Biases. It's almost like when you have a very small ML engineering team and they don't exactly know, in the future, how many resources the company will devote to them, or what their biggest priorities are, which can be more organizational challenges too. Like, do I buy this external solution? Do I build it internally? Do I want to get locked into doing things a certain way when I'm still sort of forming the basic tenets of my team? I think it's just basically very early. I think it's earlier than people think it is when it comes to standardization around infrastructure.
And so, figuring out what I'm looking for when I'm looking at startups to potentially invest in: has this startup proved out a pain point, and a user persona that has that pain point, who will nine times out of ten buy what it's offering, versus, oh, well, maybe I can do this internally, or I'm not exactly sure what my needs are here yet? And maybe that's just something that comes with time. As you have more mature companies with larger ML teams, it's like, okay, well, we actually do have these large pain points, and we'd rather not spend a team of five engineers figuring out how to code an experiment tracking system; we'd rather just go with our standard experiment tracking platform X. So that's what I've been most surprised about: there are just really no standards, even where you think there could be or should be. It's challenging to be an investor in the space, and it's a reason why, even though I know the space quite well from my time at Scale, I haven't made that many investments yet. It's challenging.
Simba: And so, you've seen a lot of different categories kind of play out, like you said, it's early days and it's not very clear to a lot of people who are probably even listening to this. Like where do I start? What should I do first? And how would you answer that?
Leigh Marie: I guess people listening to this that are ML engineers themselves, or that are like founders of ML-
Simba: More ML engineers. If I'm looking at bringing ML to my company, based on what you're seeing on the other side, watching all the different categories play out, are there specific categories or specific problem spaces that you think are the most bang for your buck, or bang for your time?
Leigh Marie: Yeah, I guess the ones that would be kind of the most consensus... I guess I can speak to categories versus actual companies, because I feel like that's less biased. So I think labeling; there's obviously been a ton of activity there the past few years, in labeling and even now synthetic data. So I think you have multiple options if you're an ML engineer that wants to label your own data and then train a model on that labeled data, versus getting something pre-trained off the shelf. In terms of places where you can find pre-trained models, if you have a somewhat high degree of technical sophistication, there are places for that that people kind of all know about. For example, Hugging Face in particular: if you want to find any NLP model, it's probably on there, and with a bit of ML engineering expertise, depending on how complicated the model is, you can probably get it to work.
And then, let's see, with experiment tracking and collaboration, I wouldn't say it's super consensus right now, but you have tools that are definitely well known and being used inside many companies. And we're still very early days in terms of monitoring, testing, active learning, and hardware optimization, but I do think there are loads of pain points inside lots of different types of companies where that's happening. So yeah, TLDR on getting started: if you're trying to label your own data, you're probably going to have to combine some point solutions. Otherwise, there are these platforms, and more by the day, where you can get on and download the latest models, or maybe you even use a cloud provider's end-to-end machine learning platform to just get something spun up, see how well it's performing, and then have a baseline for when you want to do your own more custom stuff. I've heard of people doing that.
Simba: Got it. So labeling, just getting your data working on that side, finding tools that make that part of life easier, as well as pre-trained models in general, which can kind of get you a lot of the way there if you can find ways to productize them.
Leigh Marie: Yeah, if you're coming at it with a very standard need, like an OCR model for general data or something like that, you can probably find something pretty quickly. But for more custom stuff, you have to start building your point-solution stack, which is challenging, but something that most teams larger than two or three ML engineers have to do.
Simba: Got it. And so you mentioned some of the cloud platforms doing ML, like SageMaker, and obviously you see a lot of startups, point solutions and platforms alike, also kind of playing in that space, usually in smaller sections, whereas SageMaker has something for everything. How do you think about SageMaker and Vertex? When should you use them? When should you not? Do you have any opinions about the long-term future of the cloud providers' ML platforms versus the open source and proprietary startups in the space?
Leigh Marie: I think it's a great idea for everyone, whether you're an ML engineer or you're trying to build an MLOps startup or whatever, to take a close look at what they're doing, because in terms of offerings, marketing, and distribution, SageMaker is very, very, very good. And so for somebody to use an MLOps startup over a part of SageMaker, there needs to be a clear value proposition. Maybe it's as simple as a lot of your users not wanting to be tied into a particular cloud, so they'd rather use something that is cloud agnostic; or maybe it's something as simple as, I'd rather use something open source, because I feel like it's more transparent. In those cases you have clear differentiation. I think if you're a SaaS tool, and potentially a SaaS tool that only runs on some clouds, or your users don't really care about what cloud they're on, it's harder; then your product needs to be better. And it needs to be better in such a way that it's convincing for someone who is maybe using SageMaker as a platform to move over to your product, plus some other parts of other products, and not the whole SageMaker stack.
So I think if you're just getting started with ML and you don't really care about the values of the open source community, and you don't care about being tied into a cloud, I've heard of people bootstrapping their ML journeys on SageMaker and at some point potentially moving off. For example, with Scale, I can talk about the early days: people would use MTurk, and it would perform fine. But if you wanted to do anything at high volume or with very high accuracy requirements, or if you wanted to stop managing your labeling instructions, your labeling team, or the quality of the labeling, then you wanted to move to Scale, because Scale was 10x better in that regard. So yeah, I think it's something that every company should be cognizant of, and every ML engineer should check it out to see how it works. But I always say, be aware of your competition; ultimately, there are clear value props most of the time for companies not buying everything from the same vendor.
Simba: Totally. And in AWS's case, they have such a big surface area that, like the Scale analogy, if you focus on one thing, especially with MLOps, where a lot of the specific problems to be solved can be so complex or so specific to different industries and verticals, it provides an opportunity for people to build something that's maybe a little more tailored, and perhaps a better specific solution as opposed to a general thing. Do you buy that?
Leigh Marie: Yeah, I totally buy it. I mean, I loved it when I joined Scale; everybody was like, you're silly, because Amazon does exactly what you're doing, so good luck, labeling is a solved problem, because you can go to MTurk and you can get things labeled. And that just was so not true. You can get some things labeled, but if you want, you know, AV images at 99% quality, according to whatever you think that is, and you don't want to manage the labelers or the instructions or whatever, that just wasn't going to happen on MTurk. So if you looked at it at face value, going to Scale was silly, but if you understood what was actually going on and the product differentiation, then there's a clear argument you can make for how the labeling space evolved over those years.
Simba: Yeah, definitely. Like I said, these problems can get so... I mean, labeling in particular is one of those things where there's so much that can go wrong and so much that can be optimized, from UX to literally using machine learning and otherwise, that it makes complete sense that there's a point solution there, or even a platform, I guess. Where's the point solution, where is the [inaudible 23:14]?
Leigh Marie: Yeah, it's not a point solution anymore. There are, I guess, clear other areas that Scale's gone into, so it's definitely on its way to becoming a platform.
Simba: Slightly different question, I guess. You were in the early days at Scale; you built and led a lot of product there. Now you're in the early days of being at Founders Fund. If you look back, what do you know now that you wish you knew when you started working in MLOps?
Leigh Marie: I think we've touched on a little bit of this, but I definitely wish I understood that, because the space is so rapidly changing, it does not take a lot to start. I hear every day: "Oh, I'm too late to ML. It's already happened. All these people already know so much, and I'll never be able to learn everything if I want to get involved as an engineer, or a PM, or starting my own company or whatever." I just don't think that's true. Yes, there's good value in understanding historically what the field has gone through, and obviously, if you want to get hyper-technical and do your own original research, that's a very hard challenge. But sometimes you get imposter syndrome when you're first starting out, because you don't have a huge background in the space. I'd taken a few classes in college and, you know, done an internship, and now I was helping build an ML product.
But yeah, I just wish I'd known that it's totally doable, and that people are helpful if you're curious and asking questions. There are so many resources and communities online now that can help get you up to speed. And it's such a friendly community; I love being part of it, because if I'm ever curious about a company or a field or whatever, people are usually very willing to help me out and help me understand things. I also wish I'd known that market sizes in general are underestimated, and this is true across all dev tools, though I think it's actually being corrected. Like thinking that anyone five years ago had a good idea of the TAM for autonomous vehicles, or AV labeling; it's just false. Nobody expected it to be as big as it is now. I mean, I didn't. And then another thing I wish I'd known is that a lot of it is about your go-to-market and your strategy, if you're building an MLOps company, versus having the best tech. I came out of college and joined Scale and did not have a huge startup operator background, but I realized how much of building a business, even an MLOps business, is still like building a traditional business: making sure that you actually have a business model and an opinion on how to get distribution, and things like that.
Simba: It is, all of what you mentioned about that tool space and just the general size of it. It's crazy looking around now. If you were to go back and think about how niche something seemed and how big the opportunity turned out to be, like you mentioned AV labeling, just the idea of how massive an opportunity that is alone shows how big the opportunity is for dev tools, and I guess MLOps tools more broadly, and for AWS and Google Cloud. You know, even at that scale of company, committing hard to dev tools and cloud is still a massive opportunity, and we're still just starting to see how big that will be in the long term.
Leigh Marie: Absolutely. It's one of the things that excites me most about being an investor in ML. And then it's also interesting, you look at something like Crypto or Web3, which is something that I've been looking into a lot recently. I saw a statistic that something like 0.2% of the users of the internet use Ethereum on a monthly basis. And so it's just like, wow, even with all this investor excitement, maybe we're dramatically underestimating the TAM there too. I think that's the reason I've been seeing a lot of dev tools people get into Web3 recently, which is why I've been reading up on it. But yeah, I think it's maybe a similar vibe, in that you have these very, very early adopters, and you just don't realize how much more adoption is going to happen, how much more infrastructure you need, and sort of the second- and third-order effects of a lot of this. It's definitely a very exciting world that we live in and an exciting space to be in.
Simba: Yeah. And last question for me: for people listening who are trying to learn more about this space, what resources should they be looking at? What resources do you spend most of your time on and learn from the most?
Leigh Marie: I have a lot. I try to post them on my Twitter too, which is LM_Braswell, if you're curious. But one thing that really helped me was going to conferences; that was really formative. When I talk about the ML community and how helpful they've been, going to conferences and meeting other ML engineers is super valuable. CVPR, ICML, NeurIPS; you go, and there are usually different levels of talks, so that's always super fun. There are a lot of ML founders and researchers, who I call ML influencers basically, that you can follow, for example on Twitter, and learn a lot from. I have a list on my Twitter of people like that. There are a lot of great newsletters: Eye on AI, MLOps Community, DeepLearning.AI, MIT Tech Review, the State of AI Report; all of those are great. And yeah, just following startups that you think are interesting, like yours, and the content they produce is definitely a way to stay up to date with the space, because it's rapidly evolving, which is exciting. But it also means that if you aren't staying up to date, things in six months could look very different; there could be a lot of really important companies that were founded or grew in those six months, and you're like, who is this company? So it's helpful. A lot of the work I did to stay abreast of ML developments at Scale now translates over to being an ML infrastructure investor, which is very helpful. And also, if you're an ML engineer and you have a question about MLOps, like what should I use, or what startups are in the space, I am totally, always happy to chat.
Simba: That's awesome. Thanks, Leigh Marie, for hopping on with me today. There were a ton of great insights, and we covered a ton of different topics; it was really cool to get your perspective.
Leigh Marie: Yeah, it was a fun conversation. I always love to chat. And if you disagree strongly with anything that I said, I would also love to hear about that. But yeah, I think it's so fun to be in the space, and I enjoy chatting.
Simba: Awesome. Thanks again, and we'll include your Twitter in the description so people can come and look at some of the things you've posted and chat with you if they have any thoughts.
Leigh Marie: Amazing. Cool.