Can AI Tools Be Trusted with Security-Critical Code? Real-World AI Security Risks with Liran Tal

In this engaging episode, Simon Maple and Liran Tal explore the dynamic intersection of AI and security, offering listeners a deep dive into how AI is reshaping security practices and what developers need to know to stay ahead.

Episode Description

Join Simon Maple as he hosts Liran Tal, a Developer Advocate at Snyk, in a fascinating exploration of AI's impact on security. Liran, with his extensive background in software development and application security, shares valuable insights into how AI influences security practices, both as a tool and a potential threat. The discussion covers a wide range of topics, from non-determinism in AI to the role of AI code assistants and the implications of LLM-generated responses. Liran offers practical advice for developers on integrating security into their workflows and balancing AI's capabilities with human oversight. This episode is a must-listen for developers interested in the evolving landscape of AI and security.

Resources Mentioned

Chapters

[00:00:00] Introduction to AI and Security with Liran Tal

[00:01:00] Understanding Non-Determinism in AI

[00:03:00] The Role of AI in Developer Workflows

[00:05:00] AI Code Assistants and Security Risks

[00:09:00] LLM Responses and Security Implications

[00:12:00] AI Agents as Security Moderators

[00:16:00] Contextual Understanding in Security

[00:23:00] Supply Chain Security and AI

[00:28:00] The Future of AI in Security

[00:32:00] Summary and Next Steps

Full Script

Liran Tal: [00:00:00] The non-determinism is also, in a way, very confusing for security researchers and attackers, because as a researcher or an attacker you want a known input to generate an expected response.

You can tune it and find out, by fuzzing it, what you're trying to get to. But if it changes all the time, it's super hard to figure out whether you caught it by mistake, or what exactly caused it, because a lot of the time when you do that kind of bug hunting and hacking it's all black box; you don't really see the code, or the full input and output.

So yeah, I agree, non-determinism is fun, but from different perspectives.

Simon Maple: You're listening to the AI Native Dev, brought to you by Tessl. [00:01:00]

On today's episode, we have an old friend of mine, Liran Tal from Snyk. We're going to be discussing a number of things around security: how security is affected by AI, but also how AI can help our security posture and potentially relieve some of our threat vectors, as well as obviously bring some new ones.

Liran Tal, how are you doing?

Liran Tal: All good, Simon. How are you?

Simon Maple: I'm doing very well, thank you. And welcome to this episode. Tell us a little bit about yourself, Liran, your background, and of course a little bit about Snyk. I think I'm familiar with some of their work.

Liran Tal: I should teach you how to pronounce that right. I don't know, I guess I'm recognized as the guy with the funny Yoda hat, but that doesn't really matter for you because it's a podcast and you can't see me, so you're lost. But no, seriously, I lead the developer advocacy team at Snyk. We're building a developer security company, a developer-first security company.

That basically means I'm a software developer myself. I enjoy [00:02:00] working at that crossroads, the intersection of building developer products, finding security bugs and researching them myself, building security tools, doing write-ups, and sharing all of that with other developers in the communities and working groups I work with.

So that's a lot of fun.

Simon Maple: Yeah. So obviously, full disclosure, I've had about six years at Snyk, so many of these questions I'll probably have my own opinions on, but I want to hear from you mostly. And we'll talk a little bit about various products where it makes sense to. But in terms of when we think about security and how security has affected the developer workflow, or rather how security has become included more in the developer workflow over time, how do you see AI changing that dynamic? Do you see it as potentially increasing the threat vectors that occur in an application at various stages of the workflow? Do you see it as a tool that can actually assist with that?

What's your take on that first?

Liran Tal: I think all of that, right? Because I think we're just scratching the [00:03:00] surface of everyone trying to use AI in different ways, and specifically executing LLMs with different tasks. So it's interesting to see all the different use cases.

And like any tool, it will be used and abused in different ways. So there's definitely that. Also, I think we're at this point where we're seeing AI security issues across the board, whether they're very developer focused, with deep integration into developers' workflows and practices, but also from a consumer-facing endpoint, even if you're a developer.

Some examples of that could be things like chatbots, which from a consumer level are super easy to interact with. And there are some fun stories, some incidents: an airline chatbot being held to a refund it offered. Recently one of the security companies did some research and disclosed that the Amazon review bot for their product pages was being abused. There's a funny screenshot where it's like, hey, create me a React component that does this and that, and the Amazon bot answers with the code. And [00:04:00] by the way, someone then tweeted that exact experience happening in the Amazon app, so it's probably connected to the same API. So it's super interesting and funny to see how massive the impact is as well.

So these are just two of them, and we can talk through more, I think.

Simon Maple: And I guess it's one of these things whereby, you're right, there are so many. You've mentioned a couple there, but there are so many. And in terms of a developer using AI, I guess there's a couple of ways in which developers can use AI today.

They can include AI in their applications, so they're effectively reaching out to an LLM to perform some processing or business operation before providing an output back to a user. Or they can use AI as part of their workflow, maybe using AI to empower them while writing code, or in some other part of the CI/CD, code reviews, whatever it is. And some of what you were talking about there leans more into the threat vectors of using an LLM in your application, but where do you feel things stand if we lean more into the developer side, into how developers today are using [00:05:00] AI to improve their workflow and make themselves more effective, more efficient developers, maybe talking about things like your Copilots or your Tabnines, those kinds of things?

Liran Tal: Are you saying that developers use like code assistants and do not write all the code on their own? Are you saying that?

Simon Maple: I know you're a JavaScript developer yourself, and we know how rife that ecosystem is with vulnerabilities. Anything's got to be better than that; we've got to improve here, Liran. But what would you say are the top one or two risks for a developer using AI in their process today?

Liran Tal: That's going to be a hard choice, I think, picking the top one or two. But if we boil it down, like you said, there are several workflows that developers take.

If we name a few, we can talk about some examples. So if you use AI code assistant tools in your IDE to get code done, then essentially it's like a colleague or a peer suggesting that you write a particular line of code or a given pattern of code [00:06:00] that you were given by the AI coding assistant.

And you can just auto-complete it. You're tunnel-visioned onto the business use case, the story, the bug fix, and I think you're missing out on being more judgmental and thinking about what the concerns are here in terms of security and other aspects, right?

That's an obvious red flag moment developers should notice: the AI completed this line of code for me. And at that moment you should not just try to understand what it did and get a grip on that, but also ask: did it do it in an insecure way? What are the security issues and vulnerabilities it added just by completing that code?

And you can compare that to the time, millions of years ago, when we had this small website called Stack Overflow, right? And I'm not saying I used it, and I'm not saying you used it, Simon, to copy and paste code. I'm just suggesting that maybe that happened, maybe it didn't.

But when it did, I think you were a bit more concerned and judgmental about what was going on: should I copy-paste it? Should I not? What exactly does it do? [00:07:00] I think a bit more, because it wasn't a zero-friction effort. It had friction to it, and I think that made you think twice about what you were doing. So I think one of the security concerns is just using coding assistants at all, because they're so inherently added to your workflow in a seamless, transparent, easy way.

Literally just hit the tab key, auto-complete kicks in and the code is there; ship it to production.

Simon Maple: Is that different? It's a really interesting parallel to draw, but is that different? Do you think a developer would truly look through code on Stack Overflow and wait until they understood it before they used it?

Or do you think they'd just look at something and go, that looks reasonable, copy-paste it straight into the app and then try it? Because that tends to be how a lot of people use Copilot today, right? They might see something that Copilot spits out and go, yeah, that looks reasonable, without looking at it in any detail.

It's calling a couple of the functions that I might expect it to, so just try it. And that's the fastest way to work out whether it works. Isn't it much the same?

Liran Tal: I think it's not the same. I think the difference lies in the fact that there is friction in copying [00:08:00] code from Stack Overflow versus hitting the tab key, and how easy it is to just do that in the IDE.

So for example, when you're on a Stack Overflow page, there are other answers there, right? On the page, some say no, some say yes, there are upvotes and downvotes. Maybe you're more judgmental, saying, hey, you know what, let's look at the right answer.

What was wrong in the other answers? I think there's more friction before you just accept it and copy-paste it. I'm not saying no one copy-pasted blindly; of course people did that as well. But it wasn't as easy and seamless. It added some minor psychological barriers that would get you to think for one, two, three seconds before you did it.

Maybe it was still easy, but not to the extent that you can literally just prompt the AI to complete this task or something, it does it, you accept it, and that's it. No more thinking.

Simon Maple: Yeah, it's almost security through obscurity, but in a slightly different way.

So yeah, code auto-completion [00:09:00] tools. Then we mentioned LLM responses, in terms of the nonsense or the incorrect things or even hallucinations that they could throw back at you. Anything else you wanted to highlight as some of the biggest problems?

Liran Tal: I think there's that too. So if we can summarize and say code auto-complete tools have that inherent vulnerability where they might suggest insecure code, we can take that parallel and say, okay, let's now talk about LLM responses. The issue there is, if you flip the whole script around: can you trust the LLM?

So imagine there's an application where you build on the LLM, just a chat or something else, and you want to take that data back from the LLM response to the user who interacted with it and save it to the database, right? Just as an audit trail, nothing else; you do not save identifiable information, not the user and what they asked, nothing, just the LLM response.

And I think developers might make the mistake of trusting the LLM as just some random source of input, but not user input. It's just data that flows in, and it's from OpenAI or Anthropic or whatever. It's like, [00:10:00] why would I not just save it to the database?

That's fine. But what if, as part of my interaction as an attacker, as someone who interacts with a system with an LLM in it, I'm able to prompt inject it to specifically make it respond in a way that its response text literally has a Bobby Tables SQL query in there?

The moment you save that LLM response in an insecure way in an application, it literally translates into a SQL injection. So going back to what you said about LLM responses, I think looking at them as a potential taint source, in the security lingo of source to sink, is one other observation that often easily escapes us, but it's just right there.
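
To make that source-to-sink point concrete, here is a minimal sketch (not from the episode; the table and column names are hypothetical) of treating the LLM response as untrusted input and persisting it with a parameterized query via the node-postgres client and the OpenAI SDK:

```typescript
import { Pool } from "pg";
import OpenAI from "openai";

const pool = new Pool(); // connection settings come from the standard PG* env vars
const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function auditChat(userMessage: string): Promise<string> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: userMessage }],
  });
  const llmResponse = completion.choices[0]?.message?.content ?? "";

  // UNSAFE: the LLM response is attacker-influenceable text. String-building
  // SQL with it is exactly the "Bobby Tables" scenario described above:
  //   await pool.query(`INSERT INTO audit_log (response) VALUES ('${llmResponse}')`);

  // Safer: treat it like any other untrusted user input and bind it as a
  // parameter, so it can never change the shape of the query.
  await pool.query("INSERT INTO audit_log (response) VALUES ($1)", [llmResponse]);

  return llmResponse;
}
```

The same thinking applies wherever the response lands: HTML output, shell commands, file paths, and so on.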

Simon Maple: I hear some people say, okay, you can use LLMs sometimes as a form of safety net as well, maybe using coding agents or AI agents effectively to say, look, I'm going to have a code reviewer, I'm going to have a security agent, and so forth, and they're effectively looking at the code changes or whatever, and [00:11:00] they're providing their input as a security expert or a performance expert or a code reviewer. And they might say, okay, I recognize this as a drop table, I recognize this as a potentially malicious input, or whatever. How good are they, would you say, at being that blocker that identifies those types of issues? Or, if you're using an AI agent to write some code, how good are they at assessing whether vulnerabilities exist in that code before it can be pushed?

Liran Tal: So I think that's a valid route to take; it's an assessment you could make. But the moment you mention code agents, that comes with its own potential security considerations. So go back to the OWASP Top 10 for LLMs, which I know you've ventured into, Simon, and you have a lot of experience there.

You probably remember things like excessive agency and overprivileged access to sensitive systems. And I think especially if you are building in-house coding agents that do those kinds of things, moderate content, add security barriers or whatever, you're even more likely to make mistakes that fall [00:12:00] into things like excessive agency and overprivileged access, because if you're doing it in house you probably don't understand the security framing and the security risks involved. And there are a bunch of cases; there are databases now, much like the CVE lists, that track vulnerabilities disclosed in these libraries and projects.

So for example, there was a really popular project for coding agents a while back, maybe a year back, called AutoGPT. You give it a task, like build a website. It goes off, runs its own agents, and reasons about what it needs to do.

Learn HTML? It learns HTML. Build CSS, whatever. There was a vulnerability there that you could actually exploit by adding a prompt injection into one of those sources, the website or whatever it interacts with, and that would create a command execution on the system.

It also existed for LangChain; the libraries themselves become a vulnerable aspect as well. Again, if you give a coding agent access to systems, to interact with APIs, you can imagine where this is going. The [00:13:00] whole flow is a lot bigger now, but it's also interesting.

Okay, so someone interacts with this coding agent to moderate content or whatever. Maybe it does that, but maybe it also has access, in order to moderate the content, to some sensitive system, to an API call, to make sure that content doesn't exist in that system or whatever.

But then it gets completely diverted into doing something else, like spitting all of that content back out, and things like that. So there are a bunch of really interesting flows and cases where coding agents really become a problem.
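
Liran's point about excessive agency boils down to capability scoping. As a small, entirely hypothetical sketch (not tied to AutoGPT or LangChain), one way an in-house moderation agent can be constrained is to expose only an explicit allow-list of narrowly scoped tools through a single audited dispatcher:

```typescript
// Hypothetical in-memory store standing in for a real moderation backend.
const contentStore = new Map<string, { id: string; text: string }>([
  ["42", { id: "42", text: "example content" }],
]);

type Tool = {
  name: string;
  description: string;
  run: (args: Record<string, string>) => Promise<string>;
};

// Narrow, read-only capability: the agent can look a content item up...
const lookupContent: Tool = {
  name: "lookup_content",
  description: "Fetch a content item by id (read-only).",
  run: async ({ id }) => JSON.stringify(contentStore.get(id) ?? null),
};

// ...but deletion is deliberately NOT registered, so a prompt-injected
// "now wipe everything" has no capability it can invoke, and the calls
// that are allowed all pass through one audited chokepoint.
const ALLOWED_TOOLS = new Map<string, Tool>([[lookupContent.name, lookupContent]]);

async function dispatchToolCall(name: string, args: Record<string, string>): Promise<string> {
  const tool = ALLOWED_TOOLS.get(name);
  if (!tool) throw new Error(`Tool "${name}" is not permitted for this agent.`);
  console.info(`agent tool call: ${name}`, args); // audit trail
  return tool.run(args);
}

dispatchToolCall("lookup_content", { id: "42" }).then(console.log);
dispatchToolCall("delete_all_content", {}).catch((e) => console.error(e.message));
```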

Simon Maple: Yeah, really interesting. And actually, listening to you there reminded me of a quote from a previous episode, which Guy recently recorded with Caleb Sima.

They talked about how, if you're solely relying on an LLM's output or its review of some code to tell you whether vulnerabilities exist in that code, one of the biggest [00:14:00] problems isn't just that it may give you a right answer or a wrong answer.

Another problem, not necessarily a bigger one, although it would be good to discuss, is that you could run it ten times and get different answers every time. That non-determinism is actually a very interesting problem, because if you were to run a test, make a change and run it again, and it says there's no problem here, is that just because the result is non-deterministic and it didn't catch it this time, or have you actually fixed it? And which is actually the worse problem to have sometimes? Because that non-determinism can be extremely frustrating and can leave some gaps as well.

So is that something you have seen, using AI as a tool to identify security issues?

Liran Tal: I think it's worth calling out that the non-determinism in these LLM responses is literally by design: it's designed to sample somewhat randomly across the layers of reasoning of the model, the statistical [00:15:00] model, and resolve something back. Otherwise it would literally just be a big if-else statement.

Simon Maple: I thought that was what AI was. No?

Liran Tal: The AI can do that too. It's okay, I won't hold it against them. So what I found interesting was a case not in a moderator or coding assistant experience, but a different one, where I was chatting with the LLM.

I often do those live hacks where I chat, a real chat with the LLM, real interactions, and I don't do them as recorded videos. I figure, yeah, I can test it five minutes before, but I have no idea if it will work in the sixth minute, on the demo.

In one of those cases I was prompting it to create an output that would be a cross-site scripting, because it gets added onto the page in an insecure way. And my prompt didn't work. And I was like, ah, it's not working. Damn it.

So I tried a different one, and then another; the third one worked. But yeah, you can't predict the responses. And it's interesting, because the non-determinism is also, in a way, very confusing for security researchers and attackers, because of how it responds.

As [00:16:00] a researcher or an attacker, you want a known input to generate an expected response, and you can tune it and find out, by fuzzing it, what you actually need to get to. But if it changes all the time, it's super hard to figure out whether you're really catching it.

Was it by mistake, or what exactly caused it? Because a lot of the time when you do that kind of bug hunting and hacking, it's all black box. You don't really see the code, or the full input and output. So yeah, I agree, non-determinism is fun, but from different perspectives.

Simon Maple: Yeah, absolutely. And you mentioned prompting there, and how changing the prompt slightly or rerunning something matters. From a developer's point of view, if I'm in my IDE, how much does it honor the fact that I asked for something in a secure way or not?

If I say, provide me a secure way of doing some path traversal, does it respect that? Or is there a way I can craft a prompt that increases my chances of getting something more secure?

Liran Tal: It does not work at all, from the experiments that we have done. At all.

And it's funny, because again we [00:17:00] recorded these as live demos. Brian Clark on my team runs a lot of the YouTube stuff for Snyk, and he had these sessions, "the AI is gonna get me fired", where he literally installs different AI coding assistants and all of them get it wrong. He prompts it fully: hey, please create a simple Node.js web app that allows users to create, update and delete stuff. And it goes on to say: security is important, I will be deploying this in the real world, and my job is to ensure it's a hundred percent safe from any security issues.

If it isn't, I will be fired. Please take this seriously. Three exclamation marks, hit generate, the code is generated. You run Snyk on it a minute later, and well, it has vulnerabilities in it: NoSQL injections, open redirects, whatever. And we tried it with different cases.

I can't say that it always completes it in an insecure way, but we definitely did not need a hundred takes to figure out that this is what it does; it happens literally on the first take. You can try it.
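
For a sense of what those findings look like, here is a hypothetical sketch (not code from Brian's sessions; the routes and allow-list are made up) of an Express pattern assistants commonly produce, an open redirect, next to a constrained version:

```typescript
import express from "express";

const app = express();

// Typical generated code: redirect wherever the query string says.
// This is an open redirect: /login-unsafe?returnTo=https://evil.example works too.
app.get("/login-unsafe", (req, res) => {
  res.redirect(String(req.query.returnTo ?? "/"));
});

// A safer variant: only allow relative paths we expect on our own site.
const ALLOWED_RETURNS = new Set(["/", "/dashboard", "/settings"]);

app.get("/login", (req, res) => {
  const target = String(req.query.returnTo ?? "/");
  res.redirect(ALLOWED_RETURNS.has(target) ? target : "/");
});

app.listen(3000);
```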

Simon Maple: So what you're [00:18:00] saying is there's just too much existing insecure JavaScript code for it to train on.

I like that. I'm saying that tongue in cheek.

Yes. As a Java person who's enjoyed trolling you for the last 20 years, yeah.

Yeah.

Liran Tal: Too many vulnerabilities in npm packages. I'm with you.

Simon Maple: What about asking an LLM, if we point out an issue with the code? So let's say there's a cross-site scripting.

If I say to an LLM, there's a cross-site scripting issue in my code here, can it remediate the issue, or provide me with some code that fixes it? Is that something we can trust, or something that gives us a better chance of getting some secure code?

Liran Tal: So I think you can definitely try it, and I'm not saying you shouldn't. But I don't think we've seen cases where you could completely trust it, where, like you said, a hundred percent of the cases you tried worked. I have one example where it's not exactly prompting it to make something secure; it's a bit different. For example, I often demonstrate this: if you have a front-end project and you use React, there's this directive in the API of [00:19:00] React called dangerouslySetInnerHTML, and you can put whatever HTML you want in it, and that's what's going to end up on the page.

Now, this is an escape hatch that sometimes developers and libraries need to use for different reasons, and it's properly named dangerouslySetInnerHTML. So the expected flow is that developers use this escape hatch, and then a few lines before or after they create a new function, a function definition called escapeXSS or sanitizeCrossSiteScripting or whatever.

They put in a string, hit the return key, and expect the LLM to auto-complete the escaping, the proper way to code it. That's the workflow you would want. The thing is, and I've done that experiment several times, it doesn't matter which format of code, which code pattern, it gives you to do the XSS escaping.

It's always wrong and right at the same time. What I mean by that is: the code it gives you to do output escaping for cross-site scripting in an [00:20:00] HTML element is always correct. But if the user's input flows into an attribute of the HTML, none of the escaping patterns the LLM gives you is correct, because to escape attributes you have to escape other characters beyond the HTML ones.

So this is where security becomes a very nuanced and subtle thing, where you have to understand the context of output encoding in different mediums: output into HTML, output as JSON on the page, output into an attribute or into CSS. They all need to be escaped differently.

And the mistake here is not the LLM's, but yours, because you still did escape XSS; it just doesn't know any better. It gives you the output, so it's correct, but at the same time it's also wrong, and you will have an XSS vulnerability there. So back to what you were saying, it's a tricky question, because I don't know if I trust it enough to understand all the permutations of user input, where it can flow, [00:21:00] where the business logic takes the data.

And to craft specifically secure code, the output escaping and sanitization or whatever, into what I would want it to do, or what I would hope would be secure.
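
A minimal sketch of the failure mode Liran describes, with hypothetical helper names (not from the episode): an escape function that is fine for HTML element content but insufficient for attribute values, next to a variant that also escapes quotes:

```typescript
// Enough for text placed between tags: <div>{escaped}</div>
function escapeHtmlText(input: string): string {
  return input
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// For attribute values you must also escape quote characters, otherwise
// input like  " onmouseover="alert(1)  breaks out of the attribute.
function escapeHtmlAttribute(input: string): string {
  return escapeHtmlText(input)
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

const userInput = `" onmouseover="alert(1)`;
console.log(`<a title="${escapeHtmlText(userInput)}">link</a>`);      // still exploitable
console.log(`<a title="${escapeHtmlAttribute(userInput)}">link</a>`); // inert
```

In practice, context-aware output encoding from a maintained templating layer or a sanitizer such as DOMPurify beats hand-rolled helpers, and that context is exactly what an auto-completed escapeXSS function is missing.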

Simon Maple: Really interesting. And there's a couple of things this reminds me of. One is that I had a chat with Omer Rosenbaum, who is the CTO of Swimm, which is a tool that generates documentation based on code flows and things like this. And one of the things they mentioned was that if they were to only use an LLM to understand and try to learn what the code is doing in order to generate documentation, 80 percent of the time, I think he said, it would fail; it wouldn't provide good documentation.

And so they balance their tooling so that they do an amount of static analysis on the code to understand the flows and, effectively, under the covers, create an AST, which they can then use with the LLM to describe some of the parts and flows and to be [00:22:00] able to provide a better level of documentation to the user.

Now, like I said, I'm familiar with some of Snyk's work. This is very similar to what Snyk Code, which is a SAST tool, does under the covers to identify issues. And thinking back to what you just mentioned about context, this is the key part, right? It's about understanding where vulnerabilities are truly present because of the context, where the flaws exist in the code.

Not just the fact that something might or might not look like a vulnerability based on outside examples. Can you tell us a little bit about the importance of that? I know that's what Snyk does, but tell us a little bit more about how useful it is.

Liran Tal: Right. It's immensely useful, because basically, if you want to simplify it, imagine you're labeling data, right?

Labeling data that is code. And if your way of labeling that data, your classifiers, is naive, or you're literally labeling it wrong, you're looking at a piece of code and [00:23:00] saying, okay, this doesn't have vulnerabilities.

Then your training data, what you spit out afterwards, is not able to detect what vulnerabilities are, or to fix them, to that extent. And that's where a company like Snyk takes it further: hey, we're going to build this machine learning engine that actually understands code flow.

It can read a lot of the code data that exists out there, but beyond all of that, beyond the code paths engine itself, it also has, because this is what Snyk does, security analysts who have reviewed vulnerable code, triaged it, and know how to say: this code is vulnerable.

This function is vulnerable, and this line of code is the one that is specifically not doing what it should. Maybe you get a path traversal because you used an API naively, like a bad join, giving it two strings instead of correctly building the path rather than concatenating one string to the other.

And I think, to an extent, because Snyk has that knowledge and that deep [00:24:00] awareness, a fountain of security expertise that you can pair with the AI, which sounds like what Omer was saying about the docs, that's the way to build something that is not just more deterministic, I would say, but where you're also confident in the high-level accuracy of the data you get back.

Simon Maple: Yeah, really interesting. And in terms of AI, with your vendor hat on, how can Snyk, and I guess people outside of Snyk as well, actually use AI most accurately or most effectively from a defensive point of view? Obviously there's security tooling and things like that, but are there additional processes we can add into our workflow so that, irrespective of what I'm using, whether it's a Copilot or manual coding or whatever it is to generate or write code, this is how I can use AI in my workflow to improve the accuracy and the security of my code?

Liran Tal: Snyk has IDE [00:25:00] extensions. There's a whole product called Snyk Code, which is a SAST tool; in other words, it's basically a static analysis product. And what it does is, the moment you copy-paste that code from ChatGPT, or complete it, whatever, and you hit Command-S, which I like to call secure on save.

I don't know how you feel about it, but I like that term.

Simon Maple: SOS. I like it.

Liran Tal: There we go. The moment you do it, it goes through the code, analyzes it, and gives you that response back within seconds. And it doesn't require you to go through a process where you build a project, which I know you Java people enjoy doing because you get long coffee breaks and stuff like that.

But we TypeScript people also sometimes go through that process, and Snyk Code doesn't require you to do that. So you can use this static analysis engine within PR checks, in the UI, in CIs and whatnot. But if you use it at the IDE level, you basically get very transparent, seamless code security regardless of whatever you used to generate the code, whether that's copy-paste from [00:26:00] ChatGPT, or Codeium, Tabnine, whatever other coding assistant you use; it doesn't really matter. The moment you paste that code, it goes and analyzes it and gives you the results.

And there's an interesting aspect there, another layer of AI in it, which is not just detecting the issue and highlighting it and giving you that squiggly red line in the IDE under the line, which is pretty cool, but there's also a fix aspect to it.

So the code is annotated and you can literally just say, okay, yeah, fix this vulnerability. And the way that was implemented also has a lot of thought behind it. A fix is, by the way, super hard to do as well, because it requires so much context, to the point where you might have to refactor an application.

But still, at that point, when you can fix something with Snyk, it will fix it, and it will rescan it to make sure that the code changes that were the output of the fix did not add another [00:27:00] vulnerability, or the same one. And also, because the engine goes through the code itself,

you get the confidence that you could literally just build the project and it will just run, and we didn't really break anything, because the AST, the abstract syntax tree, compiles to something. So there's a lot that goes into ensuring a really good developer experience.

Simon Maple: I think that added verification, that validation, allows you to actually have some level of trust that the changes that have been made aren't just an LLM, like you say, throwing something in a database and leaving it. It's making a suggestion that you can then test and have some level of confidence in before accepting.

Yeah. And we talked briefly about libraries a little during this session as well; let's talk a little bit about third-party libraries. One thing is, if we ask for a code suggestion, every now and then it might suggest, hey, let's use this library.

Do you think it's fair to say that, because of the nature of how LLMs are trained, effectively to [00:28:00] use patterns that they see all over the place, the libraries that are most popular are more likely to be the libraries it suggests? So is there some argument that using those libraries is more likely to be acceptable because a wider community is using them, or is there a path there to pulling in more insecure libraries as well?

I guess that's a very loaded question, because it assumes popular libraries are less likely to be insecure.

Liran Tal: Also true. Yeah, I get that, I agree. There's already research out there from a year back that showed you could prompt the LLM and say, hey, we're integrating with this database called ArangoDB.

Could you please give me the command to install the SDK from npm and build a project with it? And it gave you three options, very nice of the LLM to give several options, very friendly. One of them did not exist, right?

So one of them, literally, if you did an npm install, or went to the website, to the registry, and searched, you'd get a 404. So [00:29:00] it had it wrong. And you could say, okay, not that bad, I'll just choose one of the others it gave me. But if you reverse engineer the way an attacker now thinks about it, it's: yeah, the LLM is going to hallucinate, to generate something that doesn't exist.

And as an attacker, I'll go ahead and create all the permutations of those libraries I can think of. Ahead of time, I'll publish them to the different registries. I'll make them functional, installable, but with a backdoor or trojan or whatever I want in there. And there's going to be some user at some enterprise company installing it.

And at that point, it's game over; I'm in their system. You get to breach a company because AI hallucinated something, and someone has thought of that before. So that risk exists as well, and it's a real risk.
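
One lightweight guard against this kind of package hallucination is to confirm that a suggested dependency actually exists and has some history before installing it. A minimal sketch against the public npm registry metadata endpoint (the package name at the end is a made-up example):

```typescript
// Check an LLM-suggested dependency against the npm registry before installing.
async function vetSuggestedPackage(name: string): Promise<void> {
  const res = await fetch(`https://registry.npmjs.org/${encodeURIComponent(name)}`);
  if (res.status === 404) {
    throw new Error(`"${name}" does not exist on npm; possibly a hallucinated name.`);
  }

  const meta = (await res.json()) as {
    time?: Record<string, string>;
    "dist-tags"?: Record<string, string>;
  };

  const created = meta.time?.created ? new Date(meta.time.created) : undefined;
  const ageDays = created ? (Date.now() - created.getTime()) / 86_400_000 : 0;

  // A brand-new package with exactly the name an LLM just invented is a red flag.
  if (ageDays < 90) {
    console.warn(`"${name}" was published ${Math.round(ageDays)} days ago; review before use.`);
  }
  console.log(`latest version: ${meta["dist-tags"]?.latest ?? "unknown"}`);
}

vetSuggestedPackage("arangodb-sdk-helper").catch(console.error); // hypothetical name
```

From there, the usual supply chain tooling (npm audit, Snyk Open Source and the like) covers the known-vulnerability side once the package is confirmed to be real.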

Simon Maple: From that, would you assert that these are the types of changes that need the most attended validation from a human, someone to actually look at this and say, I recognize you're trying to increase my supply [00:30:00] chain here by pulling in whatever library, or a whole tree of libraries and transitive dependencies from that?

This is something I need to check out myself, always, before adding it into my project. Is that how you look at this?

Liran Tal: So I think it extends the surface of what you get from the LLM to something else that you maybe didn't think of, or aren't thinking of as that important, but it is, right?

If it gave you something very discrete, a function, ten lines of code, and what it does is just some code that you could vet, that's fine. But the moment it gives you, hey, here's a functional application or some server-side API route, but I imported five different dependencies to do validation, to do upload, to do whatever,

then at that point you're not just adding some 20 lines of code. You're literally adding more lines of code, maybe hundreds, maybe more, because, without even talking about any of them being malicious, it just suggested more dependencies. So potential issues like licensing come in again, and it really extends and creates a bigger attack surface, because it has gone from just completing code to completing code and giving you a bigger supply chain security concern that you now need to [00:31:00] be worried about.

And honestly, it's not just about whether something there is malicious or not. It could also be that it's literally outdated or really badly maintained. So if there's a vulnerability tomorrow, who maintains a fix for that? You'll get stuck. Or will you even know about it at all, right?

There's a whole other aspect to it as well. So I think from that perspective it's a bigger issue.

Simon Maple: Yeah, very interesting. And I want to get on to the deeper dive of this episode, which is going to be part two, only on YouTube; that will be the screen-share version where we actually take a look at some of these issues in a live demo. But before we go, while we're on the third-party library aspects of this and the supply chain: obviously Snyk as a company started off in the open source security space, looking at third-party libraries and so forth with Snyk Open Source. In terms of using AI or an LLM [00:32:00] here, are there aspects on the open source side, in the vulnerability database and things like that, that make use of AI as well, from the point of view of a vendor using AI to be more resilient?

Liran Tal: Yeah, definitely. I think it's a helpful way to label data. So if you give it a CVE, for example, a vulnerability record, instead of reading it line by line and going through the code or whatever, you can feed it into an LLM and get labeling, right?

What's the CVSS, what's the impact, build the whole vector, classify the CWE that's related to it, potentially build a tree. So there are different ways in which we would use AI to help us from the vulnerability database point of view, specifically to make it easier for us: get the data first from the LLM, then build on that and triage it.

But also, I'd like to hear from you: what do you think the usage perspective on that would be?

Simon Maple: Yeah, it's a great question. I think it was a great question; I think I asked you it first, right? So it must be a great question. But yeah, absolutely, I think [00:33:00] that's an important part.

And one of the other things which always struck me is the identification of issues in third parties, whereby very often third parties, for whatever reason, don't mention that they fixed a bug or an issue, potentially a security issue, in their library. So it's about identifying when these issues have been fixed but a CVE hasn't been created, for whatever reason.

Maybe they're not a CNA. Maybe they don't know the process to create the CVE. And I think there was some machine-learning-style code that was added to recognize this, trying to find other examples of similar code changes, or descriptions, even in comments and things like that, to identify when you need to upgrade because there's a security issue, even though the maintainer hasn't actually acknowledged that they fixed a security issue.

So you still have problems in your supply chain that you need to eliminate, but it's just not in a public list that is addressable from a tool, for example. So that was something I always remembered from that point of view, which [00:34:00] really helped.

And I think it's been in there from the early stages of Snyk as well.

Liran Tal: Yeah. That's been super helpful too. That's a good point.

Simon Maple: Yeah. I used to remember the stat, but I think in the Node space specifically, or the JavaScript space rather, the vast majority were like that, right?

The vast majority of issues, of vulnerabilities, weren't added to the public CVE database. They were actually just fixed without a CVE being created. And so it was one of the really important things, to get these more carefully curated databases, to get that level of depth.

Liran Tal: In hindsight also, I don't know how much people are aware of the CVE crisis in the security space, but there's a whole CVE crisis where the organizations that handle the reporting, the handling of CVE reports, of security disclosures,

have been massively understaffed, with a backlog of thousands of vulnerabilities that they have to field through. And so what's happening now is that there are potentially a ton of applications, or [00:35:00] third-party components they depend on, in different aspects depending on where you're using them, that are vulnerable, and potentially no one knows about it.

So in hindsight, I think Snyk investing in this, trying to also catch unlisted CVEs, the ones that, like you said, just happen in a repo. Someone reports them, then fixes them.

That's it. They didn't even know what a CVE is; these are just developers. Snyk catching that and saying, oh no, that's not just some commit fixing a bug, that's literally a commit fixing a security issue, and we can pinpoint the version, that is now super, super helpful, having that information and that bigger database.

Simon Maple: Yeah, absolutely. And of course, we should be mindful as well that attackers can do just the same thing and identify those types of issues. They recognize the window between a fix being created and a CVE being raised as time when they can actually perform attacks, without expecting organizations to be rushing to fix those types of issues in their stack. Liran, we could talk about this for a long time,

and I'm [00:36:00] thinking let's try and keep this to an episode under a day. So let's jump into the demo straight after this. But for now, for this part of the episode, for those who are listening and not on YouTube, we'll say thank you very much for listening.

And a big thank you to Liran for joining us today. And yeah, tune into the YouTube part of this session as well, where we'll go into a deep dive and show you some screen sharing and maybe some vulnerabilities and fixes and things like that. So Liran, thank you so much for your time and insights into this very interesting topic around AI and security.

It's been a pleasure.

Thanks for tuning in. Join us next time on the AI Native Dev brought to you by Tessl.

Podcast theme music by Transistor.fm.