cpg-analysis

Deep code property graph analysis with Joern CPG (AST+CFG+PDG) and CodeQL for control flow, data flow, taint analysis, and security auditing

CPG Analysis Skill

Purpose: Deep code analysis beyond AST. Use Joern for full Code Property Graph (control flow, data flow, program dependencies) and CodeQL for interprocedural taint analysis and vulnerability detection.

These are opt-in tools. They require Docker/JVM (Joern) or CodeQL CLI. Use codebase-memory-mcp (Tier 1, always-on) for everyday navigation. Use these for deep analysis when Tier 1 is not enough.

┌────────────────────────────────────────────────────────────────┐
│  CODE PROPERTY GRAPH = AST + CFG + PDG (PDG = CDG + DDG)       │
│  ──────────────────────────────────────────────────────────────│
│  AST  = Abstract Syntax Tree (structure)                       │
│  CFG  = Control Flow Graph (execution paths)                   │
│  CDG  = Control Dependency Graph (conditional dependencies)    │
│  DDG  = Data Dependency Graph (data flow between statements)   │
│  PDG  = Program Dependency Graph (CDG + DDG combined)          │
│                                                                │
│  Tier 2 (Joern): Full CPG with 40+ query tools                 │
│  Tier 3 (CodeQL): Interprocedural taint + security queries     │
└────────────────────────────────────────────────────────────────┘
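
To make the layering concrete, here is a minimal sketch of a property graph for a four-statement toy program. The program, the node numbering, and the helper names (`layer`, `pdg`, `out_edges`) are illustrative assumptions, not Joern's internal representation. It only shows that each layer is a set of labeled edges over the same nodes, and that the PDG is the union of control and data dependence edges.

```python
from collections import defaultdict

# Toy program (hypothetical), one node per statement:
#   1: x = read()
#   2: if x > 0:
#   3:     y = x + 1
#   4: print(x)
EDGES = [
    # (source, target, layer)
    ("method", 1, "AST"), ("method", 2, "AST"), (2, 3, "AST"), ("method", 4, "AST"),
    (1, 2, "CFG"), (2, 3, "CFG"), (2, 4, "CFG"), (3, 4, "CFG"),
    (2, 3, "CDG"),                                  # 3 runs only if the branch at 2 is taken
    (1, 2, "DDG"), (1, 3, "DDG"), (1, 4, "DDG"),    # x is defined at 1 and used at 2, 3, 4
]


def layer(name):
    """Return the set of (source, target) edges that belong to one layer."""
    return {(s, t) for s, t, label in EDGES if label == name}


def pdg():
    """The PDG is the union of control dependence and data dependence edges."""
    return layer("CDG") | layer("DDG")


def out_edges(node):
    """Group the outgoing edges of a node by layer."""
    grouped = defaultdict(list)
    for s, t, label in EDGES:
        if s == node:
            grouped[label].append(t)
    return dict(grouped)
```

Because all layers share one node set, a single traversal can mix them, for example following CFG edges to a statement and then DDG edges to the definitions it depends on.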

Tier Selection Guide

Simple symbol lookup, dependency trace, blast radius?
  → Tier 1: codebase-memory-mcp (always on, sub-ms)

Control flow paths, data flow, dead code, complex refactoring?
  → Tier 2: Joern CPG (on-demand, seconds)

Security audit, taint analysis, vulnerability detection?
  → Tier 3: CodeQL (on-demand, seconds to minutes)

Full security review before release?
  → All three tiers in sequence
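
The guide above can be read as a lookup table. The sketch below encodes it that way; the task keys and the function name `select_tiers` are made up for illustration and are not part of any tool.

```python
# Task labels are hypothetical; the tier numbers follow the guide above.
TIER_BY_TASK = {
    "symbol_lookup": 1,
    "dependency_trace": 1,
    "blast_radius": 1,
    "control_flow": 2,
    "data_flow": 2,
    "dead_code": 2,
    "complex_refactoring": 2,
    "security_audit": 3,
    "taint_analysis": 3,
    "vulnerability_detection": 3,
}


def select_tiers(task):
    """Return the tiers to run for a task, cheapest first."""
    if task == "release_security_review":
        return [1, 2, 3]  # full review: all three tiers in sequence
    if task not in TIER_BY_TASK:
        raise ValueError(f"unknown task: {task!r}")
    return [TIER_BY_TASK[task]]
```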

Tier 2: Joern CPG (CodeBadger MCP)

When to Use Joern

Scenario	Why Joern	Tier 1 Can't Do This
Tracing data flow through functions	Full DDG traversal	Tier 1 has no data flow
Understanding control flow paths	CFG analysis with branch conditions	Tier 1 has no CFG
Finding dead/unreachable code	PDG reachability analysis	Tier 1 only detects unused exports
Complex refactoring impact	Cross-function dependency chains	Tier 1 limited to call graph
Auditing third-party library usage	Deep call chain traversal	Tier 1 stops at import boundary
Understanding exception flow	CFG includes throw/catch paths	Tier 1 ignores exceptions

Key MCP Tools (Joern/CodeBadger)

Tool	Purpose	Example Query
`generate_cpg`	Build CPG for project	First-time setup or after major changes
`get_cpg_status`	Check CPG build status	Verify CPG is ready before querying
`run_cpgql_query`	Run arbitrary CPGQL queries	`cpg.method("login").callOut.code.l`
`get_cpgql_syntax_help`	Query language reference	When unsure about query syntax
`get_cfg`	Control flow graph for a method	Understand execution paths in a function
`list_methods`	List all methods in project	Overview of available functions
`get_method_source`	Get source code of a method	Read specific function source
`list_calls`	List calls from/to a method	Caller/callee analysis
`get_call_graph`	Full call graph visualization	Understand call chains
`get_type_definition`	Type/class definitions	Understand type hierarchy

Supported Languages (Joern)

Java, Scala, C/C++, Python, JavaScript, TypeScript, PHP, Ruby, Go, Kotlin, Swift, Lua

Not supported: Rust (use CodeQL for Rust)

MCP Configuration (Joern)

{
  "mcpServers": {
    "codebadger": {
      "url": "http://localhost:4242/mcp",
      "type": "http"
    }
  }
}
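
MCP tools are invoked with a JSON-RPC 2.0 `tools/call` request. The sketch below only builds such a request body for `run_cpgql_query`. The argument key `"query"` is an assumption, since the tool's input schema is not shown here, and a real client would normally use an MCP client library, which also handles the initialization handshake before any tool call.

```python
import json

MCP_URL = "http://localhost:4242/mcp"  # taken from the configuration above


def tool_call(name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 request that invokes one MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "query" is a hypothetical argument name; check the tool's schema for the real one.
payload = tool_call("run_cpgql_query", {"query": 'cpg.method("login").callOut.code.l'})
body = json.dumps(payload)
```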

Prerequisites

Docker (for Joern backend)
Python 3.10+ (for MCP server)
Install: ~/.claude/install-graph-tools.sh --joern

Common CPGQL Queries

// Find all methods that handle user input
cpg.method.where(_.parameter.name(".*input.*|.*request.*")).name.l

// Trace data flow from parameter to return (reachableBy is called on the sink and takes the source)
cpg.method("processPayment").methodReturn.reachableBy(cpg.method("processPayment").parameter).l

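// Find methods with many control structures (a rough proxy for cyclomatic complexity)
// filter takes a Boolean predicate; where expects a traversal
cpg.method.filter(_.controlStructure.size > 10).name.l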

// Dead code candidates: methods with no incoming calls
cpg.method.whereNot(_.callIn).filter(_.name != "main").name.l

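// Exception flow: callers of throwing methods that contain no try block
// (assumes the language frontend models throw and try as control structures)
cpg.method.where(_.ast.isControlStructure.isThrow).callIn.method.whereNot(_.ast.isControlStructure.isTry).dedup.name.l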
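
For intuition about what two of these queries compute, here is a toy model in plain Python. It is not how Joern implements them: `reachable_by` is a backward search over data dependence edges, starting at the sink and keeping the sources it reaches, and `uncalled_methods` is a set difference over call edges. All names and data shapes are assumptions made for the sketch.

```python
from collections import defaultdict


def reachable_by(sink, sources, ddg_edges):
    """Return the sources whose data can reach `sink` along (from, to) DDG edges."""
    preds = defaultdict(set)
    for src, dst in ddg_edges:
        preds[dst].add(src)
    seen, stack = set(), [sink]
    while stack:  # backward depth-first search from the sink
        node = stack.pop()
        for pred in preds[node]:
            if pred not in seen:
                seen.add(pred)
                stack.append(pred)
    return seen & set(sources)


def uncalled_methods(methods, call_edges, entry_points=("main",)):
    """Return methods that never appear as a callee, excluding entry points."""
    called = {callee for _caller, callee in call_edges}
    return sorted(m for m in methods if m not in called and m not in entry_points)
```

The search starts at the sink because the question is "which sources can reach this point", which is the same orientation `reachableBy` uses.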

Tier 3: CodeQL

When to Use CodeQL

Scenario	Why CodeQL	Other Tiers Can't Do This
Security audit before release	Interprocedural taint analysis	Joern has basic taint, CodeQL is deeper
Reviewing auth/payment code	Data flow from source to sink	Cross-function, cross-file taint
PR security review	Targeted vulnerability scan	Pre-built OWASP query packs
Compliance checking	CWE/OWASP pattern matching	Curated security query suites
Rust security analysis	Full Rust support	Joern doesn't support Rust

Key MCP Tools (CodeQL)

Tool	Purpose
`run_query`	Execute a CodeQL query against the database
`find_definitions`	Locate symbol definitions
`find_references`	Find all references to a symbol
`get_results`	Parse BQRS (Binary Query Result Sets)

Supported Languages (CodeQL)

C/C++, C#, Go, Java, Kotlin, JavaScript, TypeScript, Python, Ruby, Swift, Rust

MCP Configuration (CodeQL)

{
  "mcpServers": {
    "codeql": {
      "command": "codeql-mcp",
      "args": ["--database", ".code-graph/codeql-db"]
    }
  }
}

Prerequisites

CodeQL CLI (brew install codeql on macOS)
Install: ~/.claude/install-graph-tools.sh --codeql
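
The MCP configuration points at a database in `.code-graph/codeql-db`. If the install script does not create it (its behavior is not documented here), the standard CLI commands could be assembled as in the sketch below. The helper name, the default output file, and the assumption that the standard query pack is named `codeql/<language>-queries` are illustrative; check the pack name for your language.

```python
def codeql_commands(language, db_path=".code-graph/codeql-db", source_root=".",
                    query_pack=None, output="codeql-results.sarif"):
    """Return argv lists for creating and then analyzing a CodeQL database."""
    pack = query_pack or f"codeql/{language}-queries"  # assumed pack naming
    create = ["codeql", "database", "create", db_path,
              f"--language={language}", f"--source-root={source_root}"]
    analyze = ["codeql", "database", "analyze", db_path, pack,
               "--format=sarif-latest", f"--output={output}"]
    return create, analyze


create_cmd, analyze_cmd = codeql_commands("python")
# To execute: subprocess.run(create_cmd, check=True), then the same for analyze_cmd.
```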

Common CodeQL Patterns

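// SQL injection: user input flows to SQL query
import python
import semmle.python.dataflow.new.DataFlow
import semmle.python.dataflow.new.TaintTracking
import semmle.python.dataflow.new.RemoteFlowSources
import semmle.python.Concepts

module SqlConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node n) { n instanceof RemoteFlowSource }
  predicate isSink(DataFlow::Node n) { n = any(SqlExecution e).getSql() }
}
module SqlFlow = TaintTracking::Global<SqlConfig>;
import SqlFlow::PathGraph

from SqlFlow::PathNode source, SqlFlow::PathNode sink
where SqlFlow::flowPath(source, sink)
select sink.getNode(), source, sink, "SQL injection from $@.", source.getNode(), "user input"

// Unvalidated redirect (separate query, same imports): only the sink changes
module RedirectConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node n) { n instanceof RemoteFlowSource }
  predicate isSink(DataFlow::Node n) { n = any(Http::Server::HttpRedirectResponse r).getRedirectLocation() }
}
module RedirectFlow = TaintTracking::Global<RedirectConfig>;

from DataFlow::Node source, DataFlow::Node sink
where RedirectFlow::flow(source, sink)
select sink, "Unvalidated redirect from user input"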

Combined Workflow: Deep Analysis

When performing security review or complex refactoring, use all tiers:

1. SCOPE       → Tier 1: detect_changes / get_architecture
                 Identify files and modules in scope

2. STRUCTURE   → Tier 1: search_graph / trace_call_path
                 Map the call graph and dependencies

3. FLOW        → Tier 2: get_cfg / run_cpgql_query
                 Analyze control flow and data flow paths

4. SECURITY    → Tier 3: run_query with taint analysis
                 Check for vulnerabilities in data paths

5. REPORT      → Combine findings from all tiers
                 Prioritize: Critical > High > Medium > Low
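
Step 5 can be sketched as a merge and sort. The `Finding` record and its fields are assumptions for illustration, since none of the tiers prescribes a result format. The only rule taken from the workflow is the priority order Critical > High > Medium > Low.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


@dataclass(frozen=True)
class Finding:
    tier: int        # which tier produced the finding (1, 2, or 3)
    location: str    # e.g. "payments/charge.py:42"
    severity: str    # one of the SEVERITY_ORDER keys
    message: str


def build_report(*tier_findings):
    """Combine findings from all tiers and order them by severity, then location."""
    combined = [f for findings in tier_findings for f in findings]
    return sorted(combined, key=lambda f: (SEVERITY_ORDER[f.severity], f.location))
```

Usage: pass one list per tier, for example `build_report(tier1, tier2, tier3)`. An unknown severity label raises `KeyError`, which surfaces inconsistent labels early.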

Anti-Patterns

Anti-Pattern	Do This Instead
Using Joern/CodeQL for simple symbol lookup	Use Tier 1 `search_graph` (sub-ms vs seconds)
Running full CPG build on every commit	Build CPG on-demand; use Tier 1 for continuous monitoring
Querying Joern without checking `get_cpg_status`	Always verify CPG is built and current before querying
Running CodeQL without a specific security question	Have a hypothesis first; CodeQL queries are expensive
Ignoring Tier 1 blast radius before deep analysis	Always scope with Tier 1 first, then go deep on flagged areas
Using CodeQL for non-security structural queries	Use Joern CPGQL for structural/flow queries; CodeQL for security
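
The `get_cpg_status` rule can be enforced with a small guard around every query. In this sketch the two callables stand in for the MCP tools, and the status value `"ready"` is an assumption, since the real response shape of `get_cpg_status` is not shown here.

```python
def query_when_ready(get_status, run_query, query, ready_value="ready"):
    """Run a CPGQL query only after the status check reports a usable CPG."""
    status = get_status()
    if status != ready_value:
        raise RuntimeError(
            f"CPG not ready (status={status!r}); run generate_cpg before querying"
        )
    return run_query(query)
```

Passing the tools in as callables keeps the guard independent of how the MCP client is wired up and makes it easy to exercise with stubs.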

Repository: alinaqi/claude-bootstrap
Commit: 7e5f7a2

Last updated: 3 days ago
Created: 3 days ago
