DeepSeek R1: Ask Me Anything - Open Source, Open Weights, MoE innovations, Model Distillation and more!

In this episode of the AI Native Dev podcast, we dive into the transformative world of AI with a focus on DeepSeek R1, a revolutionary open-source AI model. Join our expert panel, including Amanda Brock, Richard Sikang Bian and Guy Podjarny, moderated by Simon Maple, as they explore the implications of this development for the tech industry and beyond.

Episode Description

Join Simon Maple as he hosts a compelling discussion at the State of OpenCon, featuring Amanda Brock, CEO of OpenUK, Guy Podjarny, Founder and CEO of Tessl, and Richard Sikang Bian from Ant Group. This episode delves into the impact of DeepSeek R1, a groundbreaking AI model that has taken the market by storm due to its cost-effective and innovative approach. The panel explores the broader implications for open source AI, discussing legal, ethical, and collaborative aspects. Amanda Brock provides insights into the open source legal frameworks, while Guy Podjarny shares his expertise on transparency and innovation. Richard Sikang Bian offers a unique perspective on company culture and AI development strategies. Tune in to understand how DeepSeek R1 is setting new standards in AI and what the future holds for open source technology.

Chapters

1. [00:00:00] Introduction and Welcome
2. [00:01:00] Overview of DeepSeek R1's Market Impact
3. [00:03:00] Guest Introductions
4. [00:06:00] Understanding Open Weights Models
5. [00:10:00] Cost-Effective AI Model Training
6. [00:14:00] Legal and Ethical Considerations in AI
7. [00:19:00] Global Collaboration and AI
8. [00:23:00] The Future of Open Source AI
9. [00:27:00] Discussion on Model Distillation
10. [00:31:00] Final Q&A and Conclusion

The Rise of DeepSeek R1

DeepSeek R1 has made a significant impact since its introduction to the market. As Simon Maple highlighted, the model's entry "really was a big splash in the market" due to its cost-effective approach: a reported $5.6 million spent on computing power for the base model, compared with an estimated $100 million for OpenAI's o1 model. The market reacted profoundly, with major indices and companies like Nvidia experiencing sharp drops. As Maple noted, "the market shock waves were pretty substantial," underscoring the broader economic implications of DeepSeek R1's emergence. This shows how advancements in AI can ripple across the global economy, affecting industries far beyond technology.

Furthermore, the cost reduction in training AI models like DeepSeek R1 suggests a paradigm shift in AI development. With reduced financial barriers, a diverse array of companies, including startups, can now enter the AI field, potentially leading to a democratization of AI technology. This democratization could foster a more competitive and innovative market, providing opportunities for new players to emerge and contribute to AI advancements.

The Open Source Perspective

Amanda Brock provided a legal perspective on the open source aspects of AI models, contrasting DeepSeek with Llama. She stated, "DeepSeek is more open source to me as a nuanced lawyer than Llama," highlighting DeepSeek's use of the MIT license, which complies with the open source definition. The significance of such licensing is crucial for ensuring that models are accessible and can be used for any purpose, thus promoting further innovation.

The legal frameworks governing open source AI are integral to the technology's evolution. By adopting licenses like MIT, organizations signal their commitment to transparency and collaboration. This approach not only fosters trust within the developer community but also encourages contributions that can enhance the model's capabilities. Understanding these legal nuances is essential for stakeholders who wish to engage meaningfully with open source AI projects.

DeepSeek's Impact on AI Development

DeepSeek's cost-effective model training has profound economic implications. As Richard Sikang Bian explained, the reduced costs democratize AI development, allowing more companies to participate in the innovation process. Bian noted, "the more players, the merrier," emphasizing how DeepSeek's approach can lead to a more inclusive and dynamic AI development landscape. Additionally, he shared insights into DeepSeek's unique company culture, where young talents are empowered to innovate, further driving technological advancement.

This empowerment of young talents is not just about economic efficiency; it’s about fostering a culture of innovation. By giving young developers the freedom to explore and propose new ideas, companies can cultivate a fertile environment for breakthroughs. This cultural shift is vital for the sustained growth of AI, as it encourages diverse perspectives and novel solutions that can address complex challenges.

The Concept of Open Weights Models

The concept of open weights models is pivotal in the AI industry. Guy Podjarny clarified, "open weights models are usable... you can download them, you can use them," but he also pointed out the limitations, such as the lack of transparency in the training data and source code. This raises important questions about the potential for community-driven innovation in AI, as the ability to contribute and fork models is essential for fostering collaboration and further advancements.

Open weights models represent a step towards transparency, yet they also highlight the challenges of defining what "open" truly means in the context of AI. While the accessibility of weights is a positive development, the absence of open training datasets and methodologies limits the potential for community contributions. Addressing these limitations is crucial for realizing the full collaborative potential of open source AI.
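To make the distinction concrete, here is a minimal sketch of what "open weights" buys you in practice: the checkpoint can be downloaded and run locally with standard tooling, even though the training data and training code remain closed. The model id is an assumption for illustration; any open-weights checkpoint on Hugging Face would work the same way.

```python
# A minimal sketch of using an open-weights model: the weights download and
# run locally, but the training data and pipeline behind them stay private.
# The model id below is illustrative; swap in any open-weights checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed id for the example

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("What does 'open weights' mean?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=80)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Everything above works without seeing a single training example, which is exactly the gap the panel highlights: you can use and fine-tune the artifact, but you cannot reproduce or fork the process that created it.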

Legal and Ethical Considerations in AI

Amanda Brock addressed the complexities of data ownership and intellectual property in AI. She emphasized the importance of defining data rights and privacy, noting, "we don't progress... until we work out how that data and information can really be opened up." Creating a robust legal framework for open AI is essential to navigate these challenges and ensure that innovation continues without infringing on rights or privacy.

The ongoing debate around data rights and privacy is a critical issue for AI developers and users alike. As AI systems become more integrated into daily life, the need for clear, enforceable regulations that protect individual and organizational interests becomes increasingly apparent. By establishing clear guidelines, stakeholders can mitigate risks and promote ethical AI development.

Global Collaboration and Competition

International collaboration is crucial for AI research and development. The panel discussed geopolitical factors, including the proposed Decoupling AI from China Act. Amanda Brock argued against it, stating, "we should be focusing on global collaboration," as it fosters innovation and benefits society. Guy Podjarny added that while competition is important, restrictions could hinder technological progress and market opportunities.

Collaboration across borders enhances the diversity and richness of AI research, leading to more robust and adaptable technologies. However, geopolitical tensions can pose significant challenges to such collaboration. Navigating these complexities requires diplomatic and strategic efforts from industry leaders and policymakers to ensure that global AI development continues to thrive.

The Future of Open Source AI

The Model Openness Framework marks a significant step towards open source AI. The panel envisioned a global consortium akin to the Linux Foundation to lead open source AI initiatives. Guy Podjarny expressed hope for "scale and contribution in open models," emphasizing the potential for a collaborative foundation to drive AI development. Such efforts could lead to a more transparent, inclusive, and innovative AI ecosystem.

The future of open source AI hinges on the ability to unify diverse stakeholders under a common vision. By establishing a centralized consortium to oversee AI initiatives, the tech community can ensure consistent standards and practices that promote innovation. This collaborative approach will be key to unlocking the transformative potential of AI technologies for global benefit.

Full Script

**Amanda Brock:** [00:00:00] So we all have different lenses that we already look at open source from. And when we look at DeepSeek: DeepSeek is more open source to me as a nuanced lawyer than Llama. And the reason is that what DeepSeek has released is released and shared on an MIT license, which is an approved open source license that complies with the open source definition, and it meets the standard that anyone can use what has been released for any purpose.

**Simon Maple:** You're listening to the AI Native Dev. Brought to you by Tessl.

Thank you all for coming. So first of all, I'll say a big thank you, and welcome to everyone in the room at State of OpenCon. And the reason I specifically say State of OpenCon is that we're also live streaming directly to the AI Native Dev. And that's because we were planning on doing [00:01:00] an AMA, and we were chatting with Amanda, and she kindly allowed us to do this in combination with State of OpenCon.

So thank you very much for that. This will also be one of our podcast episodes next week on the AI Native Dev, so hello to people listening there, to people listening live, and also in the room. This is indeed an Ask Me Anything, so we will have mics wandering around the room. It's very important for us, for this format to work, for everyone to ask the burning questions that are on your mind.

We have an amazing panel here that can field those questions. I'll give a very brief intro to some of the topics that we're going to be talking about here, then I'll ask each panelist to introduce themselves. So I would say there are a number of different ways in which we can talk about DeepSeek, but just for those who have been living under a rock for the last couple of weeks: DeepSeek is a new LLM which has come out of DeepSeek, a Chinese AI company, and they've built a number of models. The one that everyone's talking about now is DeepSeek R1. When it launched, just ten days ago, I think you were [00:02:00] saying.

There really was a big splash in the market. We saw only 5.6 million dollars spent on computing power for the base model, compared with what people are estimating at around 100 million for OpenAI's o1 model, and yeah, the market shock waves were pretty substantial. I've got some numbers here. The S&P 500 fell 1.4%. The Nasdaq dropped 2.3%. Nvidia shares dropped 16 percent, just by someone else's model coming out. Nvidia, interestingly, were worth about 3.6 trillion dollars. That went down to 2.9 million dollars. So that's what? Trillions. Trillions. Trillions, sorry. It wasn't that bad, was it? 700 billion dropped in just a day or two.

Incredible. The market has recovered somewhat though, thankfully. But this obviously has had a big impact. I'm gonna ask some questions in a second, but please, Amanda, you, the person, the one person on this panel who needs very little [00:03:00] introduction, but why not, why don't you introduce yourself to those on the stream as well?

**Amanda Brock:** Okay. Good afternoon everybody. I'm Amanda Brock. I'm CEO at OpenUK and we are the industry organization for the business of open technology in the UK. So software, hardware, data standards, and of course, AI. And we've been hosting our annual conference, State of OpenCon, here in London for the last two days.

We are now on the final hurdle, we're almost done. And we've had several hundred, many hundred people come through the doors over the last couple of days and join us in big discussions about openness, but particularly openness in the context of AI.

**Simon Maple:** Amazing. And for those online who would be interested in learning more about that, where would they go?

**Amanda Brock:** openuk.uk. I didn't buy the domain.

**Simon Maple:** Okay. No worries. Guypo?

**Guy Podjarny:** I am Guy Podjarny, or Guypo. I am the founder and CEO of Tessl, probably a bit better known for being the founder of Snyk before that, which continues to do well. But I got drawn into the world of AI, and I am an addict, and back on the entrepreneurship path with it.

At Tessl we're looking, as you may know, to really reimagine software development for the [00:04:00] era of AI. And I've always been a long-time open source kind of champion, and passionate about doing that, including in the world of AI.

**Simon Maple:** Wonderful. And finally, Richard.

**Richard Sikang Bian:** Hi, everyone.

My name is Richard Sikang Bian. I always begin my introduction by saying that my last name, Bian, is like the Spanish "bien", so it means I'm a good person. My first name is perfect because we're in the UK now: it's like "scone", as in the cookie, so I'm a pretty good cookie. Maybe that's why I got invited here, to this panel.

Yeah, so I've been working with Ant Group for four and a half years. I actually built the Ant Group open source program office, and I have been working on open source since then. In my last life, before I joined Ant Group, I worked at Microsoft and Square as a software engineer. So I've been on the technical side, and now I'm working on the technical strategy aspect, reporting to the CTO office.

Yeah, it's my first time in the UK actually, but it has been a pretty long stretch of a flight; I flew from Hangzhou. I think that's maybe another reason why I got invited to this panel, because DeepSeek, they're based in Hangzhou.

But then, truth [00:05:00] be told.

**Amanda Brock:** They are, and so their terms and conditions all go back to Hangzhou. We'll talk about that later, I'm sure.

**Richard Sikang Bian:** Yeah, but there are 11 million people in Hangzhou, okay? Not everyone is well connected. But very glad to be here, thank you.

**Simon Maple:** Has it rained yet since you've been here?

**Richard Sikang Bian:** Oh, yeah.

**Simon Maple:** It has actually.

And have you eaten Indian food, our national dish?

**Richard Sikang Bian:** No, actually I'm pretty under the weather today. I just showed it.

**Simon Maple:** Oh yeah, that'll definitely wake you up. That's also part of the British experience. The full British experience. Yeah, absolutely. Amazing.

Feel free to put your hands up when you have questions, and we'll have mics going around. I'll kick off just quickly before we jump to that first question. And this is just to mention again some of the things that we brought up earlier and that Guypo was talking about.

How open is DeepSeek R1?

**Guy Podjarny:** Yeah, so I should kick us off. DeepSeek R1, and V3 before it, are open weights models, which implies they are usable: they're weights, you can download them, you can use them, just like Llama or some of Mistral's models, etc. How open is an open weights [00:06:00] model is an interesting question.

I've got a bit of a beef, as I mentioned in an earlier session, with the term open for it; I think of it a lot more like freeware. It is free to download. You can further tune it, you can self-host it, you can do a lot of that stuff, but is it open? Not so much. What is open is the weights, the sort of output of training on the data.

And so we can use that, you can continue from it. But a lot of the makings of an open weights model, and that's true for DeepSeek as well, are not open. The source code to create it is not open. The data that was fed into it is not open. And the whole training process of taking that data with that source code through an iterative process to get it into those weights is not open.

That is not unusual for an open weights model; that's also true for Llama, and again, for most open weights models. But it's still not open per se. If DeepSeek, or the Chinese government, whoever, legitimately chose not to do the next version, the community cannot fork it and continue and evolve from here.

Nor can people contribute to it. And of course there's a transparency aspect. So there is a gap here. It's not [00:07:00] unique to DeepSeek, but because of the geopolitical nature of DeepSeek being developed in China versus in the U.S., this lack of knowing what's behind the curtain comes up a little bit more.

And it probably has surfaced as more of a conversation.

**Amanda Brock:** Yeah. So from a legal perspective, I was talking earlier about open source, and one of the real challenges we have is all of these words that we're all using. We're using words to describe things that are very new, and we're using words to which we all ascribe different meanings.

So when we talk about open source, we can look at the legal definition, but you also have your experience. So if you've come to open source in the software context and you've come to it from a sort of community perspective, you think of it as a community collaborative thing.

If you've come to it from business, you think of it as a business collaboration. So we all have different lenses that we already look at open source from. And when we look at DeepSeek: DeepSeek is more open source to me as a nuanced lawyer than Llama. And the reason [00:08:00] is that what DeepSeek has released is released and shared on an MIT license, which is an approved open source license that complies with the open source definition, and it meets the standard that anyone can use what has been released for any purpose, right? So the question then, as Guy's pointing out, is which bits have been released. Llama is different because Llama is not on an approved license. It doesn't meet the open source definition. And the main reason for that is there's an acceptable use policy, but actually there's a requirement that if you hit 700 million users of Llama, you'll go and get a commercial license from Meta. And that introduces a friction that means two things. It means that you can't rely on a free flow in your usage, building it into other products, because if you're super successful you may have to go back, get a commercial right, and give Meta money. But also it means, from Meta's perspective, that they've hedged their bets, right?

So when you enable people with open source, you enable them in a way that you enable [00:09:00] your competition. So you give your competition your innovation. And they haven't quite done that, because of this clawback where, if you're super successful, they're going to get money from you. So there is a piece there around that.

And then building on what Guy said about the inability to see how it's trained and see what was going on: that is all right, but actually they also provided some, I don't know if it's training documentation, I can't remember what they call it, but the sort of manual that goes with it. And I know that Hugging Face are now trying to reverse compile that and look at the process.

**Guy Podjarny:** Yeah,

**Amanda Brock:** Exactly, and they can do that because those instructions are there, which I think is really interesting.

**Simon Maple:** Yeah, awesome.

Let's jump into our first question.

**Audience:** When I read about DeepSeek, one of the things that really struck me was that the price decrease in training, on the face of it, looks so large that it will really change the economics of LLMs, in a way that means we can have them throughout the economy without recommissioning all our nuclear plants.

Do you think that this innovation allows [00:10:00] us to have any hope of additional jumps in the efficiency of LLMs? Because I think that would be quite profound, if this became something that's both open and also widely available. I just wonder if you can comment on that. What can we learn from DeepSeek to help us understand?

**Simon Maple:** And I was chatting with Sylvain in the hallway track, and one of the amazing things he said was that he wasn't surprised at all that they could train this model for 5.6 million, because they're competing here against an o1 model which was done 18 months ago. So actually a lot of the investment and work was done very early on. To actually replicate that is pretty hard without doing the huge amount of research which led up to it.

Who wants to take this question? There we go.

**Richard Sikang Bian:** Yeah, maybe I'll just provide some of the intel as far as I know. First of all, just a disclaimer: I work for Ant Group, opinions are my own. So, we talk about DeepSeek only getting so popular in the past 10 days, but in reality, it's worth mentioning that their technical report for R1 is [00:11:00] very open, and it's not the first time they did it.

Because if you rewind everything back to May 2024, they actually posted this V2 model with a pretty nice paper indicating a lot of optimizations they did there. And there has been a lot of adoption within China of the V2 models and the optimization methodologies by a lot of these AI companies.

So back to that question, I would say I'm definitely imagining the cost of training will essentially cause two things to happen. One, there has been intrinsic fear from many of the companies that they can no longer play the training game. So we do see a lot of even the most popular startups, like 01 in China, change their game from training to inference.

So now, by knowing that the training can be, you know, an order of magnitude cheaper, it actually brings a lot of the players back to the game again. So I can totally imagine the training game becomes a more interesting one, because we are working in open source. We know that open source will bring a lot of those [00:12:00] kinds of natural innovations.

The more players, the merrier. So that's definitely one thing I see. The other thing I'm seeing is that the DeepSeek model by itself is actually not a surprise, because there has been this kind of six months of road paving over there. One thing I personally learned a lot from is the way that they organize their company.

It's very unique. They hire all these kinds of young talents into the company with no hierarchy. Each one can actually make their own proposals, and if their proposal is correct, they can pretty much ask for a lot of the cards just to do their own experiment. And that reflects a lot of what OpenAI was like at the very beginning, right?

We heard similar things. I for one really welcome that, because it feels like a rare, almost outlier case in how some of the organizations in China behave. And I do hope that will lead to a new wave of innovation, as far as I can tell.

**Guy Podjarny:** I strongly agree with that and I think there's a lot of reasons to think that there will be additional leaps.

I think one of them is [00:13:00] well said here, which is just that more people can participate in the innovation attempt, where before the gate of capital was very high. That's slightly countered with: how much do you need it? It's very much a case of necessity being the mother of all invention here: they had to solve some hardware problems because otherwise they did not have the hardware to overcome them. And so part of that necessity maybe slightly diminishes, so there's a risk of that. But someone will figure it out, and more people can participate in the game; more innovators does translate to more innovations.

I think the other piece of it is that this is still relatively new. You're right that V2, and then V3 after it, brought a lot of innovations. But I do think it drew everybody's attention, especially with R1. And there's probably a wave of subsequent optimizations to this technique, whether it's from other smart people that now build on top of it, or from dedicated hardware that actually optimizes for these specific use cases, versus the inter-chip communication or memory utilization that's been optimized before. So I think there was a lot of optimization there. But I also think that [00:14:00] one of the interesting things that they've done is around mixture of experts, and a little bit the kind of improved ability to divide up the world's knowledge. In simple terms, think about the fact that you don't need your doctor's knowledge to participate when you're writing code, or vice versa.

And being better able to pull in the right knowledge at the right time is a part of how they've done the load balancing of expertise here. That, combined with low costs, might drive us towards either more specialized models or slight silos of knowledge, which are much cheaper and much faster to run.

So I think this is a case of: you turn a page, and now there are more opportunities for innovation. It might take a while until someone literally turns another page and there's another 10x improvement, but I think the journey towards 10x is all but guaranteed.
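For readers who want a concrete picture of the mixture-of-experts idea Guy describes, here is a minimal sketch of top-k expert routing: a router scores each token, and only the chosen experts run. It is illustrative only, not DeepSeek's implementation; the class name, expert count, and layer sizes are invented for the example.

```python
# A toy top-k routed mixture-of-experts layer: the router picks a few experts
# per token, so only a fraction of the parameters run for any given input.
import torch
import torch.nn as nn

class ToyMoELayer(nn.Module):
    def __init__(self, dim: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        # Each expert is a small feed-forward block.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )
        self.router = nn.Linear(dim, n_experts)  # scores each expert per token
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, dim). Route each token to its top_k experts and combine
        # their outputs, weighted by the router's probabilities.
        probs = self.router(x).softmax(dim=-1)                   # (n_tokens, n_experts)
        weights, chosen = torch.topk(probs, self.top_k, dim=-1)  # (n_tokens, top_k)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e  # tokens whose slot-th pick is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(16, 64)              # 16 token embeddings of width 64
print(ToyMoELayer(dim=64)(tokens).shape)  # torch.Size([16, 64])
```

The point of the design is exactly the one Guy makes: per token, most experts stay idle, so inference cost scales with top_k rather than with the total parameter count.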

**Amanda Brock:** Can I just add, on a more sort of business and commercial level, and on an innovation level: I think this is a super important piece. However we're going to define any of these [00:15:00] words about openness, what fundamentally we're seeing happen here is that it was trained on existing models that had a level of openness about them, whether it's full open source, whether it's open innovation, open access, whatever it is.

And what that does is allow others to come and use that innovation and innovate themselves. And it's that old open source phrase about standing on the shoulders of giants, and iterating and evolving the innovation. And I think that's absolutely critical to everything that you're all talking about.

**Simon Maple:** And I think it's critical to the success that we've seen.

**Amanda Brock:** Yeah.

**Simon Maple:** But does that lead us to a place whereby companies are more likely to hide their models, or hide their core models, such that others can't replicate their success the way DeepSeek has here? I've heard rumors, I'm not going to substantiate them, but I've heard some people believe that there are companies that are hiding V5 models right now.

That they're building other models on, but they're hiding that almost as their [00:16:00] gold mine that they don't want to expose, because others might be able to do that. The value here is clear in terms of what DeepSeek has already shown, but is this behavior almost encouraging a closed angle to it as well?

**Amanda Brock:** I guess it could, right? Who knows. But what I do know, and I love this, I don't know if you saw this on Friday: Sam Altman, you may have seen me with it earlier, I'm going to say it for the fifth time today, said that OpenAI is on the wrong side of history when it comes to open source. That they should have opened their weights up, and that he is looking into this.

He's personally an advocate for it. He then caveats it with: it's not high up their priority list, and not everybody in OpenAI agrees with him. But that was on Friday. So at least one person is reacting in a way that's not what you're suggesting. No, I agree.

**Guy Podjarny:** I think, first of all, standing on the shoulders of giants is interesting. The phrase that comes to my mind is standing on the shoulders of reluctant giants, because I don't think it's fair to say that distilling the sort of frontier models is [00:17:00] necessarily building on open data.

Yeah. It is building on data that exists there, whose right to exist is already in doubt around how they got their data. And so there's a bit of a turtles-all-the-way-down problem there. But I do think that there's a fundamental question now around leaning into competition versus trying to restrict access.

It's true in the sort of U.S.-China relationship, given the failure of the chip restrictions to maintain the perceived U.S. leadership. It's true in the context of the models and keeping them in. I for one think that you can't really hide technology. It's a bit of a lost battle, and if they want broad use, the only way for, say, OpenAI to hide their models is to really restrict access, which in turn restricts their market opportunity. And especially given the incredible valuations that these companies are running at, I think that's practically not viable. So I think that explains the comment we're hearing from Sam Altman.

It's more along the lines of, you can interpret it at least as: hey, it looks like we lost the [00:18:00] battle of controlling distillation of our models, and at the same time, we're not winning the battle of people building on top of us, because our models are not open weights.

So we're hedging it. We're neither winning this nor winning that. Maybe we should lean into the open piece, so more people build on us, because anyway we're going to need to monetize at the app level either way.

**Amanda Brock:** Can I jump in with just one thing before you do, Richard? Something you keep coming back to, Guy, I've noticed all day, is the data piece.

And I think we've got a real challenge here. For the industry, I think it's the piece that all the sort of legal wonks and governance people really need to focus on. And it's data. And I don't think we need to define anything beyond working out, at least in the short term, what the data being open means, and whether it's data we're really talking about or information.

What the rights that we're worried about are: whether they're copyright, intellectual property and things, whether it's privacy, whether it's contract confidentiality; what it is, how you [00:19:00] get those rights if you need them, and where and how you want to use them. And I think that's really, from a sort of lawyery kind of perspective, the bit that we need to focus on to give you the next stage.

And until we work out how that data and information can really be opened up, and do it in a way that's replicable and transferable, we don't progress. We are always going to hit that block, right?

**Simon Maple:** Yep. Yep. I agree.

**Audience:** I wanted to build on a conversation in a previous session, where it was discussed that now there is a definition of model openness, the Model Openness Framework, and there's a model that's now reached Class I open science level, which is a really big milestone in the move towards open source AI.

And my question to you is, what would you like to see happen next? Is there something that would be a landmark moment where you can see us really beginning to gather momentum towards having truly open source AI?

**Richard Sikang Bian:** Yeah, so I'll probably tackle that, also in conjunction with answering the last question.

I will say I'm totally not surprised by [00:20:00] that comment, because I think open is a choice. The way I'm looking at it is, information arbitrage is a competitive advantage. People use it. But it's also probably the lowest in terms of skill. If you're just using information arbitrage as your competitive advantage, it's probably not that solid long term.

That's basically my perception of open versus closed. But it's intrinsically hard to tell people they should be more open, because it's a choice. Another thing I'm looking at is, it's just like when you play poker, right? Chip leaders are playing different hands.

That's why I honestly don't see intrinsic issues with such an ultimate strategy: if you're a chip leader, you have this kind of lead advantage, and being closed is probably your best strategy, and changing all your plays is also your best strategy. But the way I'm looking at it is like how I think Jim Zemlin and many others, in the past ten days, have been writing a lot of articles about what they call a triumph of [00:21:00] openness.

Which, to an extent, I agree with. So when I'm thinking about open, regardless of whether it's open source or open models, it's the same thing. It's just like when everyone's holding a long spoon, right? If you're in hell, everyone is starving because they can't feed themselves.

But if you're in heaven, everyone is fed, because they're helping each other. That's what I see as a positive externality. So I would say that in the next one or two years, I'm expecting an intrinsic balance across global geographic boundaries: there will be this kind of open group that is feeding each other, and we'll see a fast, accelerated process of technology advancement, while at the same time the people who are holding the strong models, Anthropic and ChatGPT, might choose to stay open and really set up business model explorations over there.

I for one feel it's a pretty stable, I would say, two years of exploration until we reach the next plateau, at which point we might see different behavior. Yeah.

**Guy Podjarny:** I would love to answer the gentleman's [00:22:00] question. I think I would love to see scale and contribution in open models. Open source AI has actually existed for a long time.

True open source, that is, with the data, with the code, with everything around it. Where it hasn't existed, and I think still doesn't exist, is in the truly large, the first L in LLM, right? And dealing with that at large scale. So that's one thing.

We're seeing things get bigger. We also heard about the OpenLLM France project earlier on; that's seven billion parameters. And so I think it's good to have those projects, and I'd love to see them scale, scale as in they run it for larger models, which does present some technical challenges.

But I think the real unlock is contribution and the ability to fork. So if we get to that mode where, like in open source, you're able to say: hey, I like this model, but I want to have a different filter on the data, or I can contribute, I want to add data. And it opens up a whole bunch of questions about how you curate that.

How do you handle the cost of training? Fine, the costs might have dropped, but they're not low. They're still in the [00:23:00] hundreds of thousands or millions of dollars. So it's not the same as "I'm going to run this open source project's fork on my computer", right? It's more substantial. So how do we deal with all of those?

What's the curation process? What is the forking process? What is the relationship between them? I think these are all solvable problems. And I guess what I would hope to see is an entity, and the easiest to imagine is something like the Linux Foundation, ideally not national, although this has very much become a national conversation, but ideally a Linux Foundation style, global, consortium-with-industry type entity that tries to really lead such a foundation.

And I think if you tap into that, if you tap into the power of community contributions and things like that, you can beat any of the others. And it can actually become the foundation where a bunch of these become like Kubernetes. We asked in one of the panels here: can there be a Kubernetes moment for AI?

I think this type of thing can be the foundation on top of which a lot of products are built.

**Amanda Brock:** Can I just add, very quickly, because we do want to take more questions: I think you're absolutely right. And I think forking, just that ability to keep the leadership in check, to bring a community together and to go in different [00:24:00] directions, is something that, if we got to that point, shifts everything. And I totally agree with you about this collaborative idea, this foundation for AI that allows everybody to collaborate and move forward. I think that's absolutely critical.

**Simon Maple:** Another question just here.

**Audience:** Not a tech question, but a legal question. Reading the mainstream press, they seem to say that DeepSeek took OpenAI's IP. I presume what that means is that there's been some sort of model extraction attack and they've extracted the weights. A, did I understand that correctly?

Seems not. And B, if I did understand it correctly, how was this model produced seemingly out of nowhere? What was the training data, et cetera?

**Guy Podjarny:** I can take a first crack here at doing it. So the process of using another model to create your new model is a legit process called distillation.

And in fact, as Simon alluded to before, it was one of the perks that you got from having a frontier model. You'd build 4o and then you [00:25:00] would use that to distill 4o-mini out of it, right? It was one of the perks, one of the things you got in return for doing it. What's been happening as these products became prevalent is people using other people's models for distillation, with more or less permission from those entities.

And I think the primary claim here is that there's been heavy distillation. I don't think, or at least I haven't heard, any indication that there was breaking in and stealing of the weights of OpenAI. But OpenAI has a general, widely used, self-serve API to access, so you don't need to be that sophisticated to leverage some basic network of accounts to effectively run a large volume of queries on top of it. And you might have dropped a few million dollars in OpenAI's lap in the process; they're probably not counted in that 5.6 million. But you have a lot of those results. And then when you look at the open weights model, if you were to use Llama, for instance, then you can do that. And in fact, that's what we're seeing now: one of the things Hugging Face is taking advantage of is, while they don't have [00:26:00] access to the data, they have access to the model and the weights.

And so they can train this would-be o1 model as an open one, like the Hugging Face one, with the DeepSeek R1 results. So I think it is less a cybersecurity challenge and more of a legal ownership challenge of whether you are allowed or not allowed to do it. I think for the services it's not really a question; you are not allowed to do it.

Just the practicality is that you pretty much can't stop it unless you lock it behind closed doors, right? And even then you can't fully block it; you can reduce the risk, but you also diminish your market.
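To make the distillation mechanics Guy describes concrete, here is a minimal sketch of collecting a teacher model's completions through a public API and turning the (prompt, completion) pairs into supervised fine-tuning data for a smaller student. The prompt list, teacher model choice, and scale are illustrative assumptions; real pipelines run vast numbers of queries and, in R1's case, combine this kind of data with further training stages.

```python
# A minimal sketch of distillation via a public API: query a teacher model,
# collect its answers, and build a fine-tuning dataset for a smaller student.
# Prompts and teacher choice are toy assumptions for illustration only.
from openai import OpenAI
from datasets import Dataset

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
prompts = ["Explain HTTP caching.", "Why is the sky blue?"]  # toy prompt set

records = []
for prompt in prompts:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative teacher; any strong model works
        messages=[{"role": "user", "content": prompt}],
    )
    records.append({"prompt": prompt, "completion": resp.choices[0].message.content})

# These pairs would then feed a supervised fine-tuning run on a smaller
# open-weights student model (e.g. with TRL's SFTTrainer); omitted here.
distill_dataset = Dataset.from_list(records)
print(distill_dataset)
```

Nothing in this loop requires access to the teacher's weights or data, which is exactly why, as the panel notes, it is hard to prevent without restricting the API itself.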

**Amanda Brock:** I think they say, if you read the data on Hugging Face, it's very clear that they've used Llama and, is it Qwen or Quek?

I always get this the wrong way around, but they've used two existing LLMs. They haven't disclosed that they've used OpenAI. I don't know, I can't comment on the legal side of it; I don't know if they have or not. But they acknowledge that this is what they've done to get to this point, and that's how they've reduced the training.

**Guy Podjarny:** And it's hard when you [00:27:00] cross legal regimes. To an extent, if OpenAI thinks Hugging Face did something wrong, and they're both U.S. companies, they can sue them within the U.S. legal system. When the same happens between a U.S. and a Chinese company, or really a U.S. and a British company, though it's a bigger divide with a U.S. and a Chinese company, that legal system doesn't protect you as much, and so you're a little bit stuck.

**Richard Sikang Bian:** Putting this in perspective, first things first: they did actually disclose that, the distillation, as part of the technical report, which is a choice. And there's nothing wrong with that, right?

Yeah, there's nothing wrong with that. It's just a choice. And plus, distillation to some extent is non-magical, right? So I'm reading some of those articles now, and I get really perplexed. If distillation is really that magical, what's going on with the other companies?

Do they not know what they're doing? So basically, if there's not enough evidence indicating that such a technique is going to help you to a certain level, I would take those claims with, I would say, a grain of salt in terms of how much it helps the model itself.

But what they actually claim as part of distillation is that the distillation process is going to help the reasoning [00:28:00] of the model, right? It will help train the smaller models to be more of a reasoning model. That's what it is, but it has nothing to do with the foundation model itself.

**Guy Podjarny:** I think it's important to note that part of the cost savings and the speed and everything comes out of the fact that they can build on that distillation, because they can use synthetic data produced from these frontier models to do something that otherwise would have been very expensive and time-consuming with human feedback.

Eventually you're distilling human knowledge as well, through this sort of human feedback; that's just a much more cumbersome and slow process. And so I do think it is significant, and it's perceived that everybody's done distillation, it's just about the ratios. What it does mean long term is that it becomes that much more expensive to build a frontier model.

So you really are challenging the financial viability of someone building a frontier model. They will still leverage past models, but they need to invest dramatically more to build that model. Subsequently, because it's [00:29:00] being commoditized, they might not be able to charge per API call nearly as much as they could before.

And so how do they really justify it? Which to an extent does come back to contributions working.

**Amanda Brock:** So there was a very good post last week on LinkedIn from Ben Brooks, talking about the document that allows you to understand how they did it, the manual, and how, by giving that with the data, they're trying to fix some of these problems, to show you how it's done. So go and do your own if you're worried on the IP front.

**Audience:** Hi, it's Anne Marie from TechInformed. I was just wondering how Senator Josh Hawley's proposed Decoupling AI from China Act might affect those working in open source. They're talking about fining companies, and even individuals, millions of dollars for collaborating with Chinese companies.

**Simon Maple:** Has anyone heard this?

**Amanda Brock:** I'm against it. I think that we should be focusing on global collaboration. I think it's how we innovate best and it's not [00:30:00] something that is good for society or for innovation.

**Guy Podjarny:** Yeah, I guess I would say it comes back to there being a choice, right? Do you compete, or do you try to restrict access?

Does the U.S. have sufficient financial force to deter people, with enough punishment and such, from supporting it, and does it produce some slowdown in other companies doing it? Probably yes. Does it actually change the end result?

Probably no. And so I think it's an understandable response that basically comes from the same place that introduced the U.S. chip restrictions to begin with, which have failed. I'm curious to see where the government goes; it's probably the most unpredictable U.S. government we've ever had, so we'll see where we go. And the reality is the UK and others can make whatever choices they want, but this really boils down to what the U.S. will decide.

It doesn't seem like a viable long term decision.

**Simon Maple:** And Richard, you're going to surprise us all by supporting it, I'm guessing.

**Richard Sikang Bian:** I'm going to surprise you by following the others. But putting this in perspective, knowing a little bit about US legislation, I lived in the US for 11 [00:31:00] years, there's an intrinsic time gap between someone writing a proposal and it becoming legislation.

So I wouldn't be too concerned about it, because nowadays the internet is so advanced that when people write something, it becomes way too visible, and I can understand why people can be very concerned about it. But after all, it's only one senator writing a proposal at a very early stage. That said, I really vouch against it, because I've been living in many different countries.

I think one thing which I really benefited from is being in cultural immersion: I did my high school in Singapore, where I took the GCE, I did my university in Canada, and I worked in the States before I went back to China. I feel a lot of those cultural immersions actually made me a better and more well-rounded person.

I can't say I'm smarter, but considering all of these kinds of dimensions and factors, I definitely feel that those cultural immersions helped me and made me a well-rounded person, so that we can actually sit together and talk about all of this.

**Simon Maple:** All right, wonderful. We're out of time, unfortunately; I [00:32:00] know there are many other questions. If you're online, reach out to podcast@tessl.io with questions for the podcast, or join us on the AI Native Dev Discord. Let's continue the conversation, because these are really important conversations to have, and really important questions to ask and get answered. A big thank you to Richard, Amanda and Guypo, and thanks all for your contributions. You're listening to the AI Native Dev, brought to you by Tessl.