Skills on Tessl: a developer-grade package manager for agent skillsLearn more
Logo
Back to articlesWhat AI Can (and Can't) Do for Your Security Workflow

16 Jan 202610 minute read

Joseph Katsioloudes

Joseph Katsioloudes

Joseph is a cybersecurity and AI leader who helps developers build securely. His open source game has reached 10K+ developers, his videos 2.8M+ viewers, and he has spoken in 25 countries.

The conversation around AI and security tends to swing between two extremes. On one side, you have the believers who think AI will catch every vulnerability before it ships. On the other hand, the skeptics who dismiss it entirely. The truth, as usual, lives somewhere in the middle. I recently had the opportunity to share my framework for navigating this space at AI Native DevCon Fall 2025, and I wanted to expand on those ideas here.

joseph posing

The key insight I've arrived at after extensive testing: in cybersecurity, we don't have a detection problem. We have a fixing problem.

The Reality of AI-Powered Vulnerability Detection

I opened my DevCon talk with a demonstration that should humble anyone betting too heavily on AI for security detection. I ran a straightforward prompt through GitHub Copilot: "list the security issues in the code and propose code to fix each of them."

The results were telling. Copilot correctly identified the SQL injection vulnerability. But it also flagged passwords stored in plain text, a problem that didn't actually exist in the code. This is hallucination in action, and it's the first major challenge with relying on AI for security detection.

The second challenge proved even more concerning: non-determinism. When I ran the same analysis against a full codebase, Copilot reported 8 issues, including prototype pollution. As a security specialist, I was intrigued since I'd never seen AI detect this particular vulnerability before. I opened the specific file and ran the identical prompt. Prototype pollution vanished from the results. Same code, same model, different outputs.

Switching to a different model didn't solve it. The fundamental issue isn't which model you choose. These AI models are designed to predict the next token – that in our case are coding keywords – and not systematically analyzing data flow and taint tracking like purpose-built security tools.

Where AI Actually Delivers Value

Here's where things get interesting.

While AI falls short on detection, it excels at fixing. This asymmetry makes sense when you think about it: detection requires comprehensive analysis and deterministic results, while fixing is more constrained, working from a known vulnerability toward a known pattern.

For this purpose, we’ve demonstrated Copilot Autofix, an AI agent that helps developers at companies like Asurion and Otto Group fix vulnerabilities in their code over three times faster than it would take them on their own. GitHub Advanced Security transformed the industry toward “found means fixed” with the power of AI, reducing mean time to remediation by 60% and enabling teams to fix vulnerabilities 3x faster.

The workflow I demonstrated at DevCon represents a hybrid approach that plays to each tool's strengths. Code scanning tools, the purpose-built security analyzers, handle detection. They have taint tracking capabilities and data flow analysis. AI then steps in to propose fixes based on those deterministic findings. You get high-confidence, deterministic detection paired with AI-assisted remediation.

This progress is definitely a step forward in bridging the gap of 1 application security specialist for every 100 developers. However, it’s crucial to remember that you should always test all your code, including the code that has been suggested to you. This is because GitHub Copilot is a copilot and you should be acting as the pilot.

MCPs and the Security Integration Challenge

The promise is compelling: connect AI agents to real-time server-side data like security alerts, code scanning results, and secret detection findings. Instead of working from static knowledge, your coding assistant in the IDE could pull your actual security findings into the conversation.

But I'll be candid about the current limitations.

The MCP protocol is synchronous, which creates poor user experiences when integrated with static application security testing tools. We've been experimenting on this integration internally, and the experience isn't there yet. It's slow, and the orchestration between AI agents and security tooling needs improvement before we can release something we're proud of.

There's also the trust dimension. As a security specialist, it's difficult for me to trust an MCP server coming from a project or company I don't already use. Many servers request overly broad permissions, creating exposure risk for our internal context, while indirect prompt injection remains a real concern when external, malicious prompts, get executed from AI.

Supply Chain Decisions at AI Speed

One of the most practical outputs I wanted to share at DevCon was a structured approach to supply chain security research. Every time you introduce a new package, you're expanding your attack surface. Yet when I asked the audience how many spend more than 10-15 minutes evaluating a dependency before adopting it, nobody raised their hand.

Our team at GitHub built a set of instruction files that prompt AI to perform structured security and community health assessments of packages. The output gives you an executive summary with categories flagged for attention, plus source URLs for verification. We extensively tested the system to minimize hallucinations, and the multi-file approach proved essential for achieving reliable quality. You can find those instructions at: gh.io/ai-dep-risk

This kind of structured prompting represents a pattern worth adopting. Rather than asking open-ended questions about security, you constrain the AI to check specific criteria against verifiable sources. You can then visit the sources yourself to verify. The instruction files are freely available in the above URL, and I'd encourage you to modify them to your liking.

Security Guidance Without the Security Specialist

Perhaps the most democratizing application I demonstrated was using AI to answer the question every project maintainer asks: “How can someone hack my project?”

In November 2024 GitHub announced a Secure Open Source Fund supporting 125 projects with $1.25 million in funding. Every project that receives our help comes with the same question: what's my attack surface? These projects are fortunate to get access to specialists. But what about all the other maintainers who don't have that access?

I showed how AI can generate a shortlist of attack vectors for any repository by asking the same question to GitHub Copilot on the web. You can do so by navigating to an open source project, such as Bootstrap, and click the GitHub Copilot icon at the top right of the page. For Bootstrap, it flagged considerations like misusing sidebars as this may introduce XSS vulnerabilities. As a security specialist, producing this kind of shortlist would require reading thousands of lines of code. AI delivers it in seconds and it’s a good starting point for a review.

This isn't meant to replace a security audit. But it helps minimize a gap that's otherwise unfillable. A maintainer can get a starting point for security considerations in minutes rather than never addressing them at all. Here it’s important to remember that in the same vein that you, the developer, are the pilot, then when it comes to security, a security professional is the pilot.

The Takeaway: Use AI Where It's Strong

My framework comes down to understanding asymmetric strengths. AI shouldn't be your only safety net for detection.

It hallucinates, it's non-deterministic, and it lacks the systematic analysis capabilities of purpose-built security tools. Always follow good security hygiene.

But for fixing known issues, accelerating research, generating tailored guidance, and democratizing security knowledge, AI proves genuinely useful. The developers who ship more secure code are those who combine deterministic detection tools with AI-assisted remediation and learning.

I've made available the secure code game repository (gh.io/scg) as a playground to experiment with these concepts. It has three seasons of challenges we developed, and we received fantastic feedback from the DevCon workshop. Try it before applying these techniques to production code.

If you want to see the full presentation, you can watch the complete AI Native DevCon talk here.

Join Our Newsletter

Be the first to hear about events, news and product updates from AI Native Dev.