Decoding RAG and ML Embeddings with Demetrios Brinkmann


MLOps Weekly Podcast

Demetrios Brinkmann, Founder, MLOps Community


In episode 22 of the MLOps Weekly Podcast, Simba Khadder and Demetrios Brinkmann dive deep into the world of RAG (Retrieval-Augmented Generation), embeddings, and the evolution of MLOps. Simba emphasizes the true purpose of RAG, which is to contextualize queries, and warns against blindly following the trend of embeddings. The duo also reminisces about the early days of vector databases, with a particular focus on Embeddinghub. As the conversation unfolds, they touch upon the maturity and ROI of MLOps, highlighting its transformation from a buzzword to a practical solution for many enterprises.


  1. Purpose of RAG: The main goal of RAG is to contextualize queries, adding as much relevant information as possible to aid the model in its operations.
  2. Embeddings Aren't Magic: While embeddings are powerful, they aren't a one-size-fits-all solution. It's essential to understand their potential and limitations fully.
  3. The Rise and Shift of Vector Databases: While vector databases like Embeddinghub were pioneers, the focus has now shifted to feature stores due to their broader potential and challenges.
  4. MLOps Maturity: MLOps has evolved from being just a buzzword to a solution that companies are willing to invest in, thanks to a clearer understanding of its benefits and ROI.



[00:00:04.470] - Simba Khadder

Hey, everyone. This is Simba Khadder, and you're listening to the MLOps Weekly podcast. This week, I'm chatting with Demetrios Brinkmann. Demetrios is the creator of the MLOps Community, the largest MLOps community that exists. If you're not a part of it, you should definitely check it out. You can go to and join the Slack channel there.


[00:00:24.060] - Simba Khadder

Demetrios has been a leading voice in the MLOps world since the early days. He has his own podcast. He talks all the time on MLOps. He is connected and knows everyone. He's also a famous ukulele player, so it's really fun to be able to flip the stage on him and be able to interview him this time. Demetrios, man, it's so great to have you on the show.


[00:00:46.610] - Demetrios Brinkmann

What's going on, Simba? It's been a little bit. I think I saw you, what, like a month ago? Maybe a few weeks?


[00:00:53.540] - Simba Khadder

[crosstalk 00:00:53] Summit.


[00:00:54.010] - Demetrios Brinkmann

There we go.


[00:00:54.780] - Simba Khadder

It's probably a few weeks ago. I feel like I should be playing the ukulele right now for an intro or something. I feel like it's [crosstalk 00:01:01] on you. We can end that way. Maybe that's how we should end this. Well, we have a ton to cover. There's a lot changing in the world. Maybe you could start by giving a state of the world. LLMs are a thing now. MLOps is still a thing. What's happening right now? What's your perspective on the state of the world?


[00:01:19.850] - Demetrios Brinkmann

I wonder if I'm holding on to a dying breed with the MLOps. I always ask people, do I need to rebrand into LLMOps? After talking to certain people and hearing how different VCs feel about the MLOps market and how MLOps is, I think a lot of people got burnt by what happened there. LLMOps feels like the new frontier. Even if you are doing MLOps, you say you're doing LLMOps so you can get funding. I'm not sure if that exactly answers the question, but it gets to it in a bit of a roundabout way. As far as LLMs go, they're here, they've got a lot of traction, they've got a lot of hype around them, and they're garnering a lot of attention.


[00:02:06.450] - Simba Khadder

Obviously, I've been in startups long enough. You've seen enough that we understand what you mean by a lot of people, especially investors, got burnt on MLOps, but maybe you could zoom in on that.


[00:02:17.610] - Demetrios Brinkmann

I'll try not to call anybody out, or if I do directly, we might have to bleep it later. But I think that what you see is the use cases in the market were projected to be X amount by 2022, 2023. Especially if you were an investor in 2018 and you were looking at the way that ML was growing and the use cases were growing, and you made your financial projections on top of that, you valued companies at a much different rate than they've grown over the past five, six years. When you see how ML has grown, it's just been incredibly hard.


[00:03:02.030] - Demetrios Brinkmann

The use cases, I don't think, have bloomed and blossomed as much as VCs' spreadsheets said they were going to yet. I don't think that it's not going to. I just think it's been slower than what we thought. Because of that, you get people that are not so excited about the market and the whole ability of this market to be the rocket ship that they thought it once was.


[00:03:28.360] - Simba Khadder

There's a saying, and I forget, I think it's Turkish, but I heard it once. I'm going to say it badly, but it always stuck with me. It's funny: once you burn your tongue on hot milk, you will blow on ice cream. It's like once you burn yourself, anything that feels like that again will trigger that same reaction.


[00:03:48.440] - Simba Khadder

I guess it's interesting to think about the last two years. There was, obviously, this big boom cycle. A lot of money got deployed, and a lot of people are looking for, in your own words, what is the frontier. Obviously, we have a deep look at MLOps, but if you talk to people in Fintech and a variety of other verticals, they will tell you the exact same story of everyone got in over their skis.


[00:04:10.970] - Simba Khadder

At that time, it felt like anyone could be a $10 billion company. Now you look at the public markets and you're like, "Oh, Twilio and these huge companies that we view as behemoths are not worth $10 billion, or maybe like around $10 billion." It's like you need to be like that to justify that valuation. All of a sudden, you look at your spreadsheets again, you're just like, "Oh, the math doesn't work anymore." I think that's what we saw.


[00:04:35.690] - Demetrios Brinkmann

It's actually much harder.


[00:04:37.190] - Simba Khadder

Yeah. There's a guy I like. His name is Jason Lemkin. He talks about SaaStr. The way he describes it, he's like, "For those two years, for the first time, it felt like SaaS was kind of easy. It felt doable. And we were all reminded that, nope, it's really hard." I think that's what we saw. I mean, there's just no… Even in DevOps, which is something that I think we have seen a lot of public companies come out of.


[00:04:59.460] - Simba Khadder

If you think of how many DevOps companies existed originally, it was hundreds, thousands probably, and how many are actually public companies that returned substantial capital, it's actually a small percentage. I think that's typical. But I think for some reason, MLOps, there was this weird feeling that, "Oh yeah, there's going to be 25 public MLOps companies," which just was never going to happen.


[00:05:23.370] - Simba Khadder

What's your sense now? It seems like there's this investment-style counteraction to MLOps. Talk about LLMOps. Does it feel like we're repeating the same cycle there? Does it feel different? What's your sense of things?


[00:05:34.780] - Demetrios Brinkmann

Well, I think I will caveat this with, I am still all in on MLOps and I still feel that there are a lot of big open questions and a lot of really important things that are happening in that field. Despite it not being the investors' darling anymore, there's a lot happening, and there's a lot of great stuff that is going on. What I do see with the LLMs is that it just basically took the ability for someone to go from zero to one. It amplified that and turned it up to 11, as they say.


[00:06:11.140] - Demetrios Brinkmann

All of a sudden, now you can, without knowing much about how machine learning works, or maybe you don't know anything, you can just hit an API and get some AI plugged into your product. Now you have a lot of people coming at this field from different disciplines, which is, for me, it's really cool to see because you have different minds and a lot of diversity and ways of thinking and ways of attacking problems that are much different than if it were only in the hands of data scientists or data engineers and ML engineers, obviously, too.


[00:06:49.680] - Simba Khadder

One thing I found very fascinating for us is our LLM framework is Python-based because all our MLOps is Python-based. The percentage of people using JavaScript for building AI applications totally caught me off guard. It makes sense in the same way you're describing that these are not data scientists moving to LLMs. These are more full-stack-looking people who are pivoting towards LLMs. I think it's more product-oriented, which is interesting.


[00:07:20.670] - Simba Khadder

What's the flip side of that? I know on the products we've seen, it seems like there's a diversity of products. Maybe the depth isn't there. What are you seeing on that side? Are these products coming out game-changing? Where are the game-changing ones? What are you seeing most of around the actual things being built today?


[00:07:39.340] - Demetrios Brinkmann

Well, I think one thing that has been fascinating for me, and it really makes me realise the value of the product mindset and the product people within the business, is all of a sudden anyone has been thrust into that position. It doesn't matter really where you sit. But if AI and this incredibly new capability that's very powerful is just an API call away and you can incorporate that into your product, you have to think about things in a different way.


[00:08:11.170] - Demetrios Brinkmann

You have to think with that product mindset, and what's the lowest-hanging fruit? How am I really going to define this for the end user of those? What's the best user experience that we can create with this new tool in our tool belt?


[00:08:25.390] - Demetrios Brinkmann

But going back to the question, I think one thing that's been incredibly clear is there's the application layer now. Whereas with MLOps, I think it was very hard to define a clear stack. With LLMOps, you are seeing a clear stack form. When we put out the report, the survey that we did a few months ago, it became really clear when we asked people what tools they are using with their LLMs in production. There was a standardization around those tools and you had different layers and different levels. That is one piece.


[00:09:05.890] - Demetrios Brinkmann

But then when it comes to the applications, and are there things that are interesting coming out as applications, I'm not necessarily sure that the applications themselves besides… Of course, every day you hear about some mind-blowing product on Twitter and then you go to use it and it doesn't work quite like it said it would. But I do think that there are a lot of new ways of looking at how we're using LLMs. Of course, chat-based may not be the best experience for a lot of these use cases. How can you go beyond chat?


[00:09:41.640] - Demetrios Brinkmann

One guy who was at the LLMs in Production Conference that we did, the virtual conference, Linus Lee, he talked about interfaces with LLMs beyond chat. There are these ideas as far as, instead of saying you can have the whole world and then some with whatever it is that you're trying to incorporate this LLM into, how can you minimise that and make it very, very intuitive for the user?


[00:10:09.530] - Demetrios Brinkmann

He was doing this at Notion, which brings me to the next point, which is that the real winners of the whole AI revolution, I think, are the ones who already have the distribution and they're incorporating AI features into their products that are already killing it.


[00:10:26.360] - Demetrios Brinkmann

It's not the AI-native products that are being built, like you saw with Lensa, the "AI photo of myself" or "generate a profile photo of me" app. A few weeks after it came out, I think there were probably three or four copycats. That shows you how A, it's pretty easy to get something going with it if you find a nice little use case; but B, it's also easy to have a lot of competition really quickly.


[00:10:56.830] - Demetrios Brinkmann

In my opinion, you have to have the product, you have to have the distribution, and then you have to think very, very carefully, as we were mentioning before, with that product mindset and that hat on as to how you're going to bring AI features into the product that everyone already knows and loves.


[00:11:17.520] - Simba Khadder

I think it's a really good point. The thing that a lot of people maybe have missed a little bit with this wave is that, one, unlike past technological waves, every incumbent is fully aware of how big a change this is. Two, they're sometimes moving faster than the startups. We're seeing these big companies get out their GPT-inspired features sometimes at the same time as a startup that's only been building that. They'll build it at a much higher level because they have data, they have more engineers on it. They're committing.


[00:11:53.050] - Simba Khadder

My sense is a lot of companies and startups I've seen, I would describe them as something that exists with AI sprinkled on. It's almost like, "Oh, it's Notion but with AI. Well, Notion is going to build Notion but with AI." I think the only places where we will see it is when the moat is more like Notion can sprinkle AI on but they can't rebuild from scratch as an AI. What would Notion look like if it was LLM-centric? If it was using a workflow of that more.


[00:12:24.680] - Simba Khadder

I don't know what that is, and I know whoever answers that question may or may not build a really, really big company. But I think there are companies that just were never possible before. Lensa is an example of that. But you need to also do that in a way that has enough complexity and depth, that's not a toy. I almost think of what's a workflow…


[00:12:46.540] - Simba Khadder

If you look at sales. I feel like I know 4,000 people building an AI-for-sales-type company. Most of them are just like, "Well, how about Salesforce adds this? Or how about if, whoever, Outreach adds this?" The answer is, "Well, it's similar but we'll move faster." It's like, "You won't. They're moving just as fast." I think it's more like how if you were to reimagine the CRM and not make it… What would that problem look like if it was solved natively with an LLM? Again, there might be nothing there. I'm sure in a lot of the verticals, there will be something. I think that's where the big companies will come from.


[00:13:21.370] - Demetrios Brinkmann

Do you think the big companies have been able to incorporate AI into their products so quickly? Because that's just another testament to why this AI revolution has come on so strong: it is so easy to get up and running and incorporate AI into your products.


[00:13:39.620] - Simba Khadder

I think it's exactly right. I think when you think of ML, when it started, even recommender systems, et cetera, it took a while for companies to catch up and do a decent job. Imagine it was as easy to go tech native in the dot-com era. Maybe Amazon would just be Walmart. I think in the AI world, it's like Amazon was built in a way that it was just really truly… It feels funny now, it's been so long, but it was truly like a dot-com native. It was Internet native. I wonder what companies are going to be AI-native today. It's much easier, but the bar is higher.


[00:14:18.950] - Simba Khadder

You need to revolutionise something or revolutionise the workflow in a much more dramatic way. I also think you have to do it in a way that doesn't replace people, which is the other thing I see: the more revolutionary LLM companies are more like, "We'll replace this whole function." I just don't believe LLMs are there yet. I'm not even against the idea of, "Hey, if we can automate everything out and me and Demetrios can go just surf on an olive farm in Italy all day and not have to worry about it, I'm for it." I'm not against the idea. There's going to be a really painful period in between, but it's more like the augmentation is where we're at.


[00:14:53.100] - Simba Khadder

I think where the biggest use cases and value are coming from is how do we make people… Even my engineers, every one of my engineers uses GPT. It makes them significantly better at their roles. In the sense of a lot of hard problems, you can just end up banging your head against the wall for a few days, a week.


[00:15:09.540] - Simba Khadder

Being able to just have this oracle (we've literally called it the oracle that we talk to) can make you maintain a very consistent pace of getting stuff out and not getting stuck and making sure you do things right beyond just the, "Oh yeah, it's automating." It does automate a little bit, automates writing some form of tests, but it's not really where I think the value has come from as much.


[00:15:30.010] - Demetrios Brinkmann

I hear a lot of people talk about like, "Yeah, we need to rebuild from the ground up with this AI-first mentality." I'm a little bit like with the crypto scene where it's like, "Everything needs to be decentralised." I feel like it's that same thought that goes through my head, that went through my head back in the crypto days as far as like, "But is everything better if it's decentralised? I don't know about that. Is every experience better if it's AI-native? What does that even mean really? It's something that you can say and people are like, 'It's going to get there in a few years, then you'll understand. But right now we just got to make the building blocks for it and rethink how it works.'" I'm a little bit like, "Okay, I guess either I can't see around the corner or I'm just too skeptical."


[00:16:19.530] - Simba Khadder

I think it's a really good point. I think of it the same way I'm skeptical of crypto. It's almost like you're describing an end state, and I'm not sure the end state is a thousand times better than the current state. Because of that, it's not really clear how we're going to get there because it's not binary, but you need to have value along the way. It needs to be a path. It has to be incremental. I think you're right that there's a lot of hype and a lot of just grandiose claims.


[00:16:47.440] - Simba Khadder

But I think there are some simple… I think data analytics. If you think of how different data analytics is when you can ask an LLM for hypotheses of like, "Hey, this is what my data looks like. Give me some things that you think I should think about." It's just like, "Oh, that's a good idea. That's not a good idea. This makes me think of this other good idea." You start thinking of what my day-to-day life would look like as a data analyst with an oracle that's very confident and sometimes right, but usually in some sort of generic area of rightness.


[00:17:18.820] - Simba Khadder

You can't trust it fully. It's not like you can just tell it go do the thing, but you can use it to help you work better. That will dramatically change the data analytics process. Things that used to be really, really important become less important, even for writing. When I write, I use GPT constantly. It's like, what would my workflow look like if, rather than switching between these two tabs, I have it in my Google Doc, I hit a thing, and it generates a summary? Is there something that is more natural?


[00:17:49.110] - Simba Khadder

I don't know what it is, and I don't think a lot of people do, and I don't think a lot of people are thinking about it this way, which I think is part of the problem going to what you're saying. Even in MLOps, I think one of the biggest problems that we ran into was there weren't really a lot of product-oriented founders who came into the scene. It was very tech-oriented founders and very sales-oriented founders. I think what we were missing was just really, really good products and there were very few in MLOps. I think in LLMs, we're going to see the same thing.


[00:18:16.360] - Simba Khadder

There'll be this rush of money and there'll be a lot of shitty products. There'll be a lot of hype-y products that sell a lot and then disappear. There'll be a lot of products that you need a PhD to understand and definitely aren't going to be driving value. Then there'll be this middle layer which will be the rare ones that actually accomplish the goal, in my opinion. Do you buy that? What do you think?


[00:18:36.980] - Demetrios Brinkmann

Yeah, I can see that. I think that happens, of course. We're in one of the sectors that is still getting heavily funded despite the current economic period. I think that's probably why there's so many people that are going into this field. Again, if you can go into the field then you don't need the PhD in machine learning to understand everything. The barrier to entry is much lower. I could definitely see what you're talking about.


[00:19:08.320] - Demetrios Brinkmann

I also just want to go back to what you were saying with the idea of these apps that are going to be much more harmonious with how we do things and what we want to do. I was just thinking about how probably… Some of the biggest pains that I have, it's not that AI could fix it. It's that I just am switching between a million different tools and tabs and portals on my computer.


[00:19:24.340] - Demetrios Brinkmann

One of my biggest pains is that I have to enter in my password every once in a while and be like, "Oh, because it's through this whatever app, it doesn't remember it from my password manager." Then tell me, AI is not built for that kind of stuff. It's not going to save me from that. We're still struggling with those kinds of things, so I'm a bit skeptical that AI is going to revolutionise the world in the way that we think or as much as we think until we can figure out how I can get my damn password manager to work inside of my apps on my phone.


[00:20:13.310] - Simba Khadder

Once Zoom works consistently and we don't have people talking when they're muted, then we've earned the right to revolutionise the world of AI. One step at a time. Let's dig in [inaudible 00:20:23] a bit. Let's talk about ML. ML, is it boring now? Are we done? Is it going away? What happens now?


[00:20:29.230] - Demetrios Brinkmann

I think I told you this before we hit record. One of the dirty truths right now is that… what my intuition is, is that the majority of the money being made in this AI world is being made by companies that are executing on their ML use cases. They have strong ML teams, and they're proving out value for their companies or enterprises with machine learning. Maybe it's not this sexy AI, and they can't talk about the newest models that they're using or how they have a billion parameters or trillion parameters, whatnot.


[00:21:08.730] - Demetrios Brinkmann

But it is very valuable to the company. There are a few use cases that have proven themselves over the years to be strong use cases for ML. Now, is it as many as we thought five years ago? Maybe not, but I think it's pretty clear that recommender systems are vital for companies that use recommender systems.


[00:21:31.680] - Simba Khadder

Yeah. I mean, fraud detection, generic anomaly detection, obviously recommenders, there's a set of use cases that we see pretty much at every company. Every Fortune 500 has multiple teams doing these things. In the NLP space, I think we'll see bigger dramatic shifts. But just coming from a background in recommender systems, there are interesting use cases where you can use LLMs, but I don't think a core recommender system like the YouTube sidebar is going to change at all because of the existence of LLMs, for a variety of reasons, from it's not the right tool for the job. It actually probably won't do as well. It's too slow, too expensive. It is really, really expensive to run these things. Think of how much money OpenAI burns on every single call you make, even if you're paying them. They're taking a huge dent for data collection and just generic market capture.


[00:22:21.010] - Simba Khadder

But, yeah, I think ML… Actually, the other piece of this I've seen (I don't know if you've quite seen this area of it) is where VCs are investing heavily… They're thinking, "Okay, what's going to give me returns in 10 years?" That's how they have to think. "Which one of these companies is going to return $10 billion?" We've just talked about how those have to be exceptional. You can't just be like another CRM.


[00:22:43.690] - Simba Khadder

If you think of it with that lens, it makes sense that AI looks like one of the very few frontiers where you could really see the next Google-type company come out. So it does make sense that they're investing there. Are there enough quality companies to make the money going in make sense? Maybe not, but it seems like, "We'll see." But I think on the other end, if you think of C-level people at large enterprises, they're doing the same thing, but they're doing it a different way. They have budget, and they're like, "Where are we investing our budget?" We've both seen that they're investing a lot of budget into AI. Every company, every Fortune 500 has had the CEO go on stage publicly and be like, "We are investing in AI."


[00:23:24.190] - Demetrios Brinkmann

Now they have to.


[00:23:25.330] - Simba Khadder

Yeah, they have to. The problem… Yeah, exactly. It's a great way to make your stock go up. But I think the thing that we've seen is, let's say you're a major bank. You have $100 million you want to invest into AI. You literally can't put that to use for LLMs. There's not enough there yet. You can put together, like a specialist… Especially when every other company is doing that, too. You're competing with all of them for talent and for other things. I think what we'll see and what we're seeing, what we're personally seeing at Featureform, is they're putting like 90% of that budget into traditional ML use cases because they know they'll see ROI there, and they view it as an extension and a continuation and not necessarily a hard break of like, "Oh, all this stuff is deprecated now. AI is the future."


[00:24:09.790] - Simba Khadder

I think they view it more as, "Well, if we get our data in order and we get models in production, we can view LLMs as a special kind of model as opposed to a whole new paradigm." The same way I think that deep learning didn't necessarily destroy random forests. They're still around. They're still everywhere. They're probably the most deployed model in the world. It's like some form of a random forest. I don't think that will change.


[00:24:29.310] - Demetrios Brinkmann

It's so funny you mentioned this idea of ROI because going back to that survey we did, the survey was like, we had a bunch of people that were using or not using LLMs in production. They filled out a lot of questions, and the questions were very open-ended. We gave them just a free text box to respond, which I later learned isn't the best way to do that. It's a lot more work on the back end if you have 150 responses from people and whatever, 20 questions each. It's all just long text boxes that you have to read and interpret when you want to create some report and try and standardise the answers and bucket them.


[00:25:12.610] - Demetrios Brinkmann

I know for the next time I'm not going to do that. But one thing that was blazingly clear was that the ROI that you can get from bringing LLMs into your use cases is not clear. Let me break down why that is, because there's a bunch of different reasons for that that people talked about. One is saying, "Hey, you know what? We add an AI feature to our product, like a ChatGPT call, and it enriches our product, but we can't charge any more for our product. So now we just cut our margin because we have to pay for the ChatGPT calls."


[00:26:03.200] - Demetrios Brinkmann

However, then other people say, "Well, we're focusing on affecting XYZ metrics. We see that if we can affect this metric by adding AI to our product suite, then we'll have better conversion rates and that will pay for itself." On the other hand, it's also not clear, if you're going to bring the models in-house, how you can justify the ROI of creating a whole new stack and having people that understand how to use and serve these models and do everything, given the resources that you're deploying. The last thing is, how do I justify the time and energy that I'm putting into this and that I'm not working on something else? There's all kinds of great questions around the ROI of using large language models. I think people are having a really hard time just bringing it up and championing it and giving a clear answer.
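
The two positions from the survey (margin gets cut versus better conversion pays for it) come down to simple arithmetic. Here is a minimal sketch of that arithmetic in Python. The function names and every number in it are hypothetical and chosen only for illustration; none of them come from the survey.

```python
def gross_margin(price: float, cost: float) -> float:
    """Fraction of the price kept after paying the per-customer cost."""
    return (price - cost) / price


def breakeven_lift(price: float, base_cost: float, llm_cost: float) -> float:
    """Relative growth in paying customers needed to keep total profit flat
    once each customer also incurs an LLM API cost (price unchanged)."""
    profit_before = price - base_cost
    profit_after = profit_before - llm_cost
    if profit_after <= 0:
        raise ValueError("The AI feature costs more than the profit per customer.")
    return profit_before / profit_after - 1.0


# Hypothetical numbers: a 100-unit plan, 20 units of existing cost,
# and 16 units of LLM calls per customer.
margin_before = gross_margin(100, 20)
margin_after = gross_margin(100, 20 + 16)
lift_needed = breakeven_lift(100, 20, 16)
print(margin_before, margin_after, lift_needed)
```

Under these assumed numbers the feature only pays for itself if it brings in enough extra conversions to cover the lift, which is the metric the second group of respondents said they were watching.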


[00:27:02.270] - Demetrios Brinkmann

But like you said, everyone and their mother has to have an AI story these days, so at this point in time, it doesn't really matter if you can't prove out the ROI, it's just, how can we add AI to our business and our product?


[00:27:15.890] - Simba Khadder

I think the technical flip side to that, which I've also seen, is the stupid patterns that we're seeing around how the systems actually look in practice because no one is actually measuring the quality of the predictions. Recommender systems, even back when I was doing it in 2016 and 2017, embeddings had become a core part of the process. We've been working with embeddings before this whole boom. We've learned a lot of lessons in the recommender system space on how to think about and how to evaluate embeddings.


[00:27:48.000] - Simba Khadder

The other piece of… Especially the RAG. RAG, the retrieval-augmented generation style of LLM systems, I think, is coming out as the winner, the core, the right way to do these things. In most companies, I think fine-tuning has its place, but I think RAG is much more likely; like 90% of deployments will look more like that than fine-tuning.


[00:28:11.220] - Demetrios Brinkmann

Way less overhead, too.


[00:28:12.530] - Simba Khadder

Way less overhead. It's more complicated for sure. But I also think that there's a lot of issues that you don't run into with fine-tuning. Fine-tuning, it doesn't memorise well. It more is like a style. It understands the style well. It won't memorise the numbers you put in, and oftentimes that's what you want it to do, or when people use it, that's how they think that it works. They think of it like traditional training where it remembers this thing, but really it's not. It's remembering this general flow of the sentences.


[00:28:36.920] - Simba Khadder

Yeah, I think the other side of fine-tuning versus RAG is, with fine-tuning, you run the risk of… The text is embedded into the model. So if you have private text that you're using to fine-tune, I might be able to be like… It's the same way in Copilot. People will sometimes start a comment with a comment that they know exists and is unique, and then it will spit out that exact paragraph of code. I think the same thing would happen in fine-tuned models. I think that where we'll see interesting value is going to be more in the RAG-based, where you can really keep the context literally to that request and you can make it very, very clear. It becomes almost like a structured query, which is what we're seeing, as opposed to more training a model, which I think is fascinating.


[00:29:20.730] - Demetrios Brinkmann

But I cut you off when it comes to the embeddings and the RAGs and how those are becoming the champions of this scene.


[00:29:30.310] - Simba Khadder

Yeah. I think the evaluation is fascinating because of the point of RAG. The more common, very straight-line use case is like: I have these documents, I chop them up, I embed them. When I make a query, I retrieve N number of documents which are related to that query. This is, in my opinion, like the "Hello, World!" of how to do this. It's not dumb, but really think about it. The goal of this thing is to contextualise your query, right? The goal of it is to add as much information that's relevant for the model to do its best work. That's the point of RAG. Just randomly chopping up documents and grabbing N documents and just throwing them in and being like, "Job well done. I wrote five queries and it looked better than the generic one," is not the right way to do this. I think we need to be thinking about, well, what's the maximum information gain for each document?
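
The "Hello, World!" pattern described above (chop documents, embed the chunks, retrieve the N closest to the query, add them to the prompt) can be sketched in a few lines of Python. This is only an illustrative sketch: the word-count `embed` function is a toy stand-in for a real embedding model, and all function names and sample documents are hypothetical. It shows the naive baseline, not the information-gain-aware retrieval that Simba argues for.

```python
import math
import re
from collections import Counter


def chunk(text: str, size: int = 50) -> list[str]:
    """Naively chop a document into fixed-size windows of words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], n: int = 3) -> list[str]:
    """Return the n chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:n]


def build_prompt(query: str, context: list[str]) -> str:
    """Prepend the retrieved chunks to the query to contextualise it."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}"


# Hypothetical documents, for illustration only.
docs = [
    "Feature stores serve precomputed features to models in production.",
    "The ukulele is a small four-stringed instrument.",
]
chunks = [piece for doc in docs for piece in chunk(doc)]
question = "What do feature stores serve?"
prompt = build_prompt(question, retrieve(question, chunks, n=1))
print(prompt)
```

Nothing in this sketch measures whether the retrieved chunks actually help the model, which is the evaluation gap being pointed out.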


[00:30:27.870] - Simba Khadder

The other thing that I notice a lot of people aren't doing as much, but we're starting to see, is even more traditional feature store use cases and this traditional key-value data. Let's say you're building a financial bot. You're just trying to ask, what should Simba do with his finances to retire by end day? You probably want to know my age, you probably want to know how much money I have. You probably want to know things that I, maybe, say about myself, like my risk appetite, my spending habits, how much do I spend on avocado toast and lattes? Probably too much. But when you think of all that stuff, those are not vector DB operations.
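
A minimal sketch of contextualising a query with key-value data rather than a vector search might look like the following. Everything here is assumed for illustration: the `USER_FEATURES` table, the `user_123` id, the feature names, and the values are all made up, and a plain dictionary stands in for a feature store or SQL table.

```python
# Hypothetical key-value features, keyed by user id. All values are invented.
USER_FEATURES = {
    "user_123": {
        "age": 40,
        "savings_usd": 50000,
        "risk_appetite": "moderate",
        "monthly_cafe_spend_usd": 200,
    },
}


def contextualise(question, user_id, features=USER_FEATURES):
    """Look up structured facts by key and put them in front of the question."""
    profile = features.get(user_id, {})
    lines = [f"{key}: {value}" for key, value in sorted(profile.items())]
    facts = "\n".join(lines) if lines else "(no profile data found)"
    return f"Known facts about the user:\n{facts}\n\nQuestion: {question}"
```

The lookup is an exact match on a key, so no embedding or similarity search is involved.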


[00:31:02.550] - Simba Khadder

There's this funny idea that graph databases are the solution. A lot of these answers are just like, a SQL database is the solution. We just need to contextualise our queries properly and we just don't even know how to do that, which is a problem to be solved. But I think the bigger problem, which I'm more trying to solve first, is getting people to realise what they're even doing here. I think we're just following a pattern blindly and being like, "Well, embeddings are magic, so it works." Embeddings are magic, don't get me wrong. But anyone who's in recommender systems can tell you that sometimes they do really, really crazy and dumb things.


[00:31:32.720] - Demetrios Brinkmann

Dude, that's classic. I think we will see, as time goes on, the improvements of using both LLMs and traditional machine learning together. I really like that you brought up that point. I also wonder, because I know back in the day you had embedding hub, and that was almost like the first vector database in a way. You're a man before your time. What happened there? I never got the full story on that.


[00:32:01.790] - Simba Khadder

Yeah. We still get hit up a lot about embedding hub. We have a document that I should probably change up. We were one of the first vector DBs.


[00:32:08.450] - Demetrios Brinkmann

It's so funny too, because now everyone is running towards vector stores. At that Databricks summit, when we saw each other, I think just about every database company that was on the floor, they all were talking about their vector solution or vector store, vector DB part of their database. Anyway, what's the story? What happened there?


[00:32:34.090] - Simba Khadder

Part of it is I saw that coming. We were early to it because I've probably built a vector database four times in my career, and I'm not the only one who's done that. I know even some of the vector DB companies, they were originally built not with LLMs in mind. They predated LLMs, you think of Weaviate, you think of Pinecone. These are not people who predicted the RAG style coming into play. But they were much more focused on recommender system style, NLP style, even semantic search, image search, those sorts of use cases.


[00:33:05.740] - Simba Khadder

We viewed embeddings as a feature, the same way that your age is a feature. We thought it was something that we needed a core database for. Where Redis and some of these other key-value stores solved that part of the equation, at the time, none of them really had a good vector DB solution. Now, Redis has a good one, and as you've mentioned, a lot of companies are coming out with good ones on the traditional databases.


[00:33:29.400] - Simba Khadder

We were like, "We need to build against something because we know the future is going here." We were like, "Well, we can just build our own and build against that as the API." Now, it started to get its own traction, which is cool and interesting, and we actually decided to very much pick a path. We couldn't do both, couldn't do the feature store and the vector DB. We chose the feature store because to this day, we still view the problem of feature orchestration, feature metadata management, those sorts of problems, or organisational problems of data and features, as the sharpest problem to solve and the hardest one to solve and also the one where I think the biggest company is to be built. I think the problem with vector DBs, and again, why we moved away from it…


[00:34:09.270] - Simba Khadder

What a vector DB is, in essence, is you're taking an index, and indices existed back in the day. It's not like we created our own index for doing approximate nearest neighbour. We just used one, same as all the other companies, but we built everything else that makes a database around it, like durability, replication, all the things you'd expect from a normal database. I just came to the conclusion that, hey, this index is hard to implement in a big database, but if the carrot is there, they can all do it and they'll all figure it out, and I think they'll all be able to do it better than at least my team would have been able to. We aren't exactly database people. I wouldn't be able to take that algorithm of approximate nearest neighbour and really, really make it that much better. At least I didn't feel like we could do it better than these other PhDs who literally did their PhD in database systems. So we decided to cut it.
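
For reference, the exact search that an approximate nearest neighbour index stands in for can be sketched as a brute-force scan. This is an illustrative baseline only, not how embedding hub or any particular vector DB is implemented; the function names and the id-to-vector dictionary layout are assumptions.

```python
import heapq
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbours(query, vectors, k=1):
    """Return the ids of the k stored vectors closest to the query.

    `vectors` maps an id to its vector. Every stored vector is compared,
    so the cost grows linearly with the size of the collection.
    """
    return heapq.nsmallest(k, vectors, key=lambda vid: euclidean(query, vectors[vid]))
```

Approximate indices give up some accuracy to avoid that full scan, and the durability and replication mentioned above would sit around whichever index is used.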


[00:34:56.270] - Simba Khadder

Part of why we built it was for the same thing that we're still seeing, which is a lot of the problem space around LLMs seems to just be, "I have documents. I need to break them up. I need to embed them. I need to do retrieval. I need to fit it together into a prompt." We've always defined features as inputs to a model, so a prompt, as an input to a model, is a type of feature. An embedding is a type of feature, and an analytical feature is a type of feature. If you view us as this orchestrator above all of it, then that's the problem we solved. Embedding hub was just a thing we did to solve our own need and build something to build against. We considered going with it, and we just didn't view that there was a path forward that we were necessarily the right company to build.


[00:35:38.770] - Demetrios Brinkmann

I see. Dude, it's fascinating to me because you scratched your own itch, but then you realised, "Well, maybe it's not the best thing for us at this moment. We'll shelve it." Then all of a sudden, it blew up like crazy. I can't imagine you sitting there and being like, "Wait. Vector stores are the biggest thing. They're the coolest thing since sliced bread right now. What are we doing with embedding hub?"


[00:36:05.020] - Simba Khadder

What's funny is it never fazed me. I never was tempted to bring embedding hub back from the dead.


[00:36:10.250] - Demetrios Brinkmann



[00:36:10.450] - Simba Khadder

Because yeah, I was always doing long-term. Sure, if I was rushing to get to a billion-dollar valuation on paper fastest, yeah, I'd be like, "Guys, embedding hub, we're going to go all in. I'm going to go talk to every VC. I'll go do the Silicon Valley thing. I'll go, whatever." But it's not my goal. My goal is to build that big a company, but to build it right, which means to actually have revenue to back it up, actually have a company that is sustainable over a long period of time, but solves a big problem. I think that in the vector DB space…


[00:36:41.100] - Simba Khadder

I think just a vector DB alone, in my opinion, the best thing that can happen if you're just that is you become, at best, like a Neo4j, which is great. Neo4j is a great company, but probably not the size and scale that a lot of these vector DBs are aiming for. I think for them to become bigger, they need to pivot to becoming these generic retrieval databases, which does not just mean adding more indices. It means building something at a higher level of abstraction. I think Weaviate's done a good job at this, but I still think that there's more to be done there, where other companies have stayed at the lower level. If you're going to stay at the lower level and you're Redis, it makes sense because Redis has everything. They're like, "We're a database for everything. You just put your data in us and then you pay us a lot of money." But if you're a company that sits above that stack or sits around that stack, it's a different game.


[00:37:27.370] - Simba Khadder

I guess if you think of things [inaudible 00:37:29] even MLOps, if I was trying to be like… You as well. If you were trying to be like, what's the fastest way to get a ton of hype? You would have renamed MLOps Community yesterday. You would have put out articles like "MLOps is Dead, LLMs Are the Future," and stuff like that. But you're pragmatic enough to see that, "Hey, you have to be on this. You can't ignore it. It's not going to go away and it's not fake." It's hard to make the crypto argument on AI. But we're still really early and we're definitely at a peak of a hype wave that will disappear, and then there will be a secondary wave where the actual ROI comes in, which is what we're seeing with MLOps. Funny enough, we're in that second bump-up. Now, people are actually buying MLOps. They weren't buying MLOps two years ago. Everyone was talking about it two years ago, but no one was signing checks.


[00:38:14.120] - Simba Khadder

Now, people are signing checks because it's become real.


[00:38:17.010] - Demetrios Brinkmann

Yeah. Also, I think the maturity level, people know what they're looking for, they know where their pains are, and they also understand what's out there and what needs to be in your ML lifecycle. Whereas before, everything was so new that it was hard to get that information. That's one thing that we tried to do, is just make sure that people understand. You need to think about your maturity when you're doing machine learning, and you need to think about what is right for you at each point in time, because I can't tell you how many times people would be like, "Yeah, it's me and me on my team and I'm going to set up Kubeflow." It's like, "Whoa, let's talk about that for a minute. Do you really need Kubeflow? Are you sure about that? Have you tried to play around with Kubeflow at all? Maybe go and try and install it, see if it doesn't crash your computer a few times before you champion that one to your boss?"


[00:39:18.930] - Simba Khadder

Yeah. Actually, it's funny you mentioned LLM ROI. The MLOps ROI story was the same thing. Early days of MLOps, everyone's like, "Yeah, this is really interesting. We can't ignore it." It's obviously something that we know we're going to do one day, but we haven't fully understood the ROI yet, and now it's much more understood. It's boring. But that's where people spend money. Most of the money these enterprises spend is on boring things that they understand really well. No big company's like, "I'm going to spend like 20% of my budget on this super crypto." It just doesn't happen. For good reason, because things need to get figured out now.


[00:39:52.490] - Simba Khadder

Anyway, I know we're actually at time, and I feel like we should do another episode soon because I feel like there's still so much more I want to talk about, but I want to stick to time, so I'm going to cut us off for now. But Demetrios, thanks so much for coming on and answering all my questions and having this great conversation with me.


[00:40:08.010] - Demetrios Brinkmann

Always a pleasure talking to you, Simba. I look forward to catching up with you in person and potentially seeing you at one of our different meetups around the globe.


[00:40:17.610] - Simba Khadder

Yeah, I'll be there.
