tessl/pypi-semgrep

Lightweight static analysis for many languages with programmatic Python API for custom integrations.

Workspace: tessl
Visibility: Public
Describes: pkg:pypi/semgrep@1.135.x

To install, run

npx @tessl/cli install tessl/pypi-semgrep@1.135.0


Semgrep

Semgrep is a fast, open-source static analysis tool that searches code, finds bugs, and enforces secure guardrails and coding standards across 30+ programming languages. It provides semantic code search capabilities that go beyond simple string matching, allowing developers to write rules that look like the code they want to find.

Package Information

  • Package Name: semgrep
  • Package Type: pypi
  • Language: Python
  • Installation: pip install semgrep

Core Imports

import semgrep

For programmatic scanning:

from semgrep.run_scan import run_scan, run_scan_and_return_json
from semgrep.config_resolver import get_config, Config

For result processing:

from semgrep.rule_match import RuleMatch, RuleMatches
from semgrep.rule import Rule

Basic Usage

from pathlib import Path
from semgrep.run_scan import run_scan_and_return_json
from semgrep.config_resolver import get_config
from semgrep.output import OutputSettings

# Configure a scan with rules
config, errors = get_config(
    pattern=None,  # Use specific rules instead of pattern
    lang=None,     # Detect languages automatically  
    config_strs=["p/security-audit"],  # Use predefined ruleset
    project_url=None,
    no_rewrite_rule_ids=False,
    replacement=None
)

if errors:
    print(f"Configuration errors: {errors}")
    raise SystemExit(1)  # exit with a non-zero status without relying on the site-provided exit()

# run_scan_and_return_json takes the path to a rules file, not a Config object
config_path = Path("/tmp/semgrep_config.yml")
# Note: this example does not serialize `config`; a valid rules YAML file
# must already exist at config_path before the scan runs

# Run scan and get JSON results
results = run_scan_and_return_json(
    config=config_path,
    scanning_roots=[Path(".")],  # Scan current directory
    output_settings=OutputSettings()
)

# Process results
if isinstance(results, dict):
    for rule_match in results.get('results', []):
        print(f"Rule: {rule_match['check_id']}")
        print(f"File: {rule_match['path']}")
        print(f"Message: {rule_match['extra']['message']}")
        print(f"Severity: {rule_match['extra']['severity']}")
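
Once the results are available as a dictionary, a common next step is to summarize them. The sketch below counts findings per severity. It assumes the results dict has the shape used in the loop above (a `results` list whose items carry `extra.severity`); the helper name `count_by_severity` and the sample data are made up for illustration and are not part of the semgrep package.

```python
from collections import Counter
from typing import Any, Dict, List


def count_by_severity(results: Dict[str, Any]) -> Dict[str, int]:
    """Count findings per severity in a Semgrep-style JSON results dict."""
    findings: List[Dict[str, Any]] = results.get("results", [])
    counts = Counter(
        finding.get("extra", {}).get("severity", "UNKNOWN")
        for finding in findings
    )
    return dict(counts)


# Hand-written sample data (not real scan output), shaped like the dict above.
sample_results = {
    "results": [
        {"check_id": "demo.rule-a", "path": "app.py",
         "extra": {"severity": "ERROR", "message": "example message"}},
        {"check_id": "demo.rule-b", "path": "app.py",
         "extra": {"severity": "WARNING", "message": "example message"}},
        {"check_id": "demo.rule-a", "path": "util.py",
         "extra": {"severity": "ERROR", "message": "example message"}},
    ]
}

severity_counts = count_by_severity(sample_results)
```

A summary like this is handy for deciding whether a build should fail, for example by rejecting any scan where the `ERROR` count is non-zero.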

Architecture

Semgrep is built from the following components:

  • Core Engine: The semgrep-core binary (written in OCaml) handles pattern matching and semantic analysis
  • Python CLI: Provides user interface, configuration management, and result processing
  • Rule System: YAML-based rules that define patterns to search for in code
  • Target Management: File discovery, filtering, and processing pipeline
  • Output Formatters: Multiple output formats for different tools and workflows
  • CI/CD Integration: Built-in support for various continuous integration platforms

Because pattern matching runs in the compiled semgrep-core binary while the Python layer handles configuration, target selection, and output, Semgrep can analyze large codebases efficiently and still be configured and integrated flexibly.
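
To make the split between the Python CLI and its callers concrete, here is a minimal sketch that drives the `semgrep` command-line tool from Python and parses its JSON output. It assumes the `semgrep` executable is on `PATH`; the helper names `build_semgrep_command` and `run_semgrep_cli` are hypothetical and not part of the package.

```python
import json
import subprocess
from typing import Any, Dict, List


def build_semgrep_command(config: str, target: str) -> List[str]:
    """Build the argument list for a JSON-producing semgrep scan."""
    return ["semgrep", "scan", "--config", config, "--json", target]


def run_semgrep_cli(config: str, target: str) -> Dict[str, Any]:
    """Run the semgrep CLI and return its parsed JSON output.

    Assumes the `semgrep` executable is installed and on PATH.
    Raises json.JSONDecodeError if the CLI exits before printing JSON.
    """
    completed = subprocess.run(
        build_semgrep_command(config, target),
        capture_output=True,
        text=True,
        check=False,  # inspect the output ourselves instead of raising on non-zero exit
    )
    return json.loads(completed.stdout)


command = build_semgrep_command("p/security-audit", ".")
```

Going through the CLI keeps the caller independent of the package's internal Python modules, at the cost of one subprocess launch per scan.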

Capabilities

Core Scanning Engine

Main scanning functionality for running semgrep analysis on codebases, including baseline scanning, dependency-aware analysis, and result processing.

def run_scan(target_manager, config, **kwargs): ...
def run_scan_and_return_json(*, config, scanning_roots, output_settings=None, **kwargs): ...
def baseline_run(baseline_handler, **kwargs): ...

See Core Scanning (core-scanning.md).
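
The idea behind baseline scanning is to report only findings that are absent from a reference scan. The sketch below illustrates that idea on two lists of JSON findings. It is not Semgrep's own baseline implementation: the helpers `finding_key` and `new_findings` are hypothetical, and keying on `extra.lines` (the matched source text) is an assumption about the result shape.

```python
from typing import Any, Dict, List, Tuple

FindingKey = Tuple[str, str, str]


def finding_key(finding: Dict[str, Any]) -> FindingKey:
    """Identify a finding by rule, file, and matched text.

    Line numbers are deliberately left out so that unrelated edits which
    shift code up or down do not make an old finding look new.
    """
    matched_text = finding.get("extra", {}).get("lines", "")
    return (finding["check_id"], finding["path"], matched_text)


def new_findings(
    baseline: List[Dict[str, Any]], current: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Return findings in `current` that have no counterpart in `baseline`."""
    seen = {finding_key(f) for f in baseline}
    return [f for f in current if finding_key(f) not in seen]


# Hand-written sample findings for illustration only.
old = [{"check_id": "demo.no-eval", "path": "app.py",
        "extra": {"lines": "eval(data)"}}]
new = old + [{"check_id": "demo.no-exec", "path": "app.py",
              "extra": {"lines": "exec(code)"}}]
introduced = new_findings(old, new)
```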

Configuration Management

Tools for loading, validating, and managing semgrep configurations from various sources including local files, registries, and cloud platforms.

class Config: ...
class ConfigLoader: ...
def get_config(pattern, lang, config_strs, **kwargs): ...
def resolve_config(config_strings): ...

See Configuration (configuration.md).
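
Configuration strings can point at different kinds of sources. The sketch below shows one way to tell them apart before loading anything. It is an illustration only, not the package's resolver: the function `classify_config_source` and its return labels are hypothetical, and the `p/` and `r/` prefixes and the `auto` keyword are the registry conventions accepted by Semgrep's `--config` option.

```python
from urllib.parse import urlparse


def classify_config_source(config_str: str) -> str:
    """Classify a config string as 'auto', 'url', 'registry', or 'local'."""
    if config_str == "auto":
        return "auto"
    if urlparse(config_str).scheme in ("http", "https"):
        return "url"
    # Registry shorthand: p/<ruleset> or r/<rule-id>.
    # Caveat: a local directory literally named "p" or "r" would be misclassified.
    if config_str.startswith(("p/", "r/")):
        return "registry"
    return "local"


kinds = [
    classify_config_source(s)
    for s in ("p/security-audit", "rules/custom.yml", "https://example.com/rules.yml", "auto")
]
```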

Rule and Match Processing

Classes and functions for working with semgrep rules and processing scan results, including rule validation and match filtering.

class Rule: ...
class RuleMatch: ...
class RuleMatches: ...
def validate_single_rule(rule_dict): ...

See Rules and Matches (rules-matches.md).
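
For orientation, a search rule is a mapping with an `id`, a `message`, a `severity`, a list of `languages`, and a pattern key. The sketch below builds such a rule as a Python dict and checks it for missing keys. The checker `missing_rule_fields` is a hypothetical stand-in written for this example; it is far less thorough than real schema validation and does not call `validate_single_rule`.

```python
from typing import Any, Dict, List

REQUIRED_FIELDS = ("id", "message", "severity", "languages")
PATTERN_FIELDS = ("pattern", "patterns", "pattern-either", "pattern-regex")


def missing_rule_fields(rule: Dict[str, Any]) -> List[str]:
    """Return the names of required keys that the rule dict lacks."""
    missing = [field for field in REQUIRED_FIELDS if field not in rule]
    # A search rule also needs at least one of the pattern keys.
    if not any(field in rule for field in PATTERN_FIELDS):
        missing.append("pattern")
    return missing


# Example rule; the id and message are made up for illustration.
example_rule = {
    "id": "no-eval",
    "pattern": "eval(...)",
    "message": "Avoid eval() on untrusted input.",
    "languages": ["python"],
    "severity": "WARNING",
}

problems = missing_rule_fields(example_rule)
```

A rules file wraps one or more such mappings in a top-level `rules:` list.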

Output and Formatting

Output formatting system supporting multiple formats, including JSON, SARIF, text, and JUnit XML, for integration with other tools and workflows.

class OutputHandler: ...
class OutputSettings: ...
class JsonFormatter: ...
class SarifFormatter: ...

See Output Formatting (output-formatting.md).
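
To show what a formatter does in principle, the sketch below renders JSON findings as one text line each. It is a standalone illustration and does not use `OutputHandler` or the formatter classes listed above; the functions `format_finding` and `format_text_report` are hypothetical, and the code assumes each finding carries `start.line`, `extra.severity`, and `extra.message`.

```python
from typing import Any, Dict


def format_finding(finding: Dict[str, Any]) -> str:
    """Render one finding as 'path:line: [SEVERITY] rule-id: message'."""
    extra = finding.get("extra", {})
    line = finding.get("start", {}).get("line", "?")
    return (
        f"{finding['path']}:{line}: "
        f"[{extra.get('severity', 'UNKNOWN')}] "
        f"{finding['check_id']}: {extra.get('message', '')}"
    )


def format_text_report(results: Dict[str, Any]) -> str:
    """Render every finding in a results dict, one per line."""
    return "\n".join(format_finding(f) for f in results.get("results", []))


# Hand-written sample finding for illustration only.
sample = {
    "results": [
        {"check_id": "demo.no-eval", "path": "app.py", "start": {"line": 12},
         "extra": {"severity": "WARNING", "message": "Avoid eval"}},
    ]
}
report = format_text_report(sample)
```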

Error Handling System

Exception hierarchy and error management functions for handling various types of errors that can occur during scanning and rule processing.

class SemgrepError(Exception): ...
class SemgrepCoreError(SemgrepError): ...
class InvalidRuleSchemaError(SemgrepError): ...
def select_real_errors(errors): ...

See Error Handling (error-handling.md).
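
An exception hierarchy like this lets callers handle specific failures first and fall back to the base class. The sketch below demonstrates that ordering with local stand-in classes that only mirror the names listed above, so it runs without semgrep installed; in real code the classes would be imported from the package instead. The helper `classify_outcome` is hypothetical.

```python
from typing import Callable


# Local stand-ins mirroring the class names above (assumption: same hierarchy).
class SemgrepError(Exception):
    """Stand-in base class for all scan-related errors."""


class SemgrepCoreError(SemgrepError):
    """Stand-in for errors reported by the core engine."""


class InvalidRuleSchemaError(SemgrepError):
    """Stand-in for rule files that fail schema validation."""


def classify_outcome(step: Callable[[], None]) -> str:
    """Run `step` and map any raised error to a short category label."""
    try:
        step()
    except InvalidRuleSchemaError:
        # Most specific handler first: the rule file itself needs fixing.
        return "invalid-rule"
    except SemgrepError:
        # Any other error from the hierarchy, including SemgrepCoreError.
        return "semgrep-error"
    return "ok"


def bad_rule() -> None:
    raise InvalidRuleSchemaError("rule is missing an id")


def engine_failure() -> None:
    raise SemgrepCoreError("engine crashed")


outcomes = [classify_outcome(bad_rule), classify_outcome(engine_failure),
            classify_outcome(lambda: None)]
```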

CI/CD Integration

Classes and utilities for integrating semgrep into various continuous integration and deployment platforms with automatic metadata detection.

class GitMeta: ...
class GithubMeta(GitMeta): ...
class GitlabMeta(GitMeta): ...
class CircleCIMeta(GitMeta): ...

See CI/CD Integration (cicd-integration.md).
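
Automatic metadata detection generally comes down to reading environment variables that each CI platform sets. The sketch below shows the idea for the three platforms named above. It is not the `GitMeta` implementation: the function `detect_ci_metadata` and its return shape are hypothetical, while the variable names (`GITHUB_ACTIONS`, `GITLAB_CI`, `CIRCLECI`, and the commit variables) are the ones those platforms document.

```python
from typing import Dict, Mapping, Optional

# (flag variable, provider label, variable holding the commit SHA)
_PROVIDERS = (
    ("GITHUB_ACTIONS", "github-actions", "GITHUB_SHA"),
    ("GITLAB_CI", "gitlab-ci", "CI_COMMIT_SHA"),
    ("CIRCLECI", "circleci", "CIRCLE_SHA1"),
)


def detect_ci_metadata(env: Mapping[str, str]) -> Dict[str, Optional[str]]:
    """Return the detected CI provider and commit SHA, or None values."""
    for flag, provider, sha_var in _PROVIDERS:
        if env.get(flag, "").lower() == "true":
            return {"provider": provider, "commit": env.get(sha_var)}
    return {"provider": None, "commit": None}


# Hand-written environment for illustration; pass os.environ in real use.
github_env = {"GITHUB_ACTIONS": "true", "GITHUB_SHA": "abc123"}
detected = detect_ci_metadata(github_env)
```

Taking the environment as a parameter rather than reading `os.environ` directly keeps the function easy to test.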

Target Management

File discovery, filtering, and processing system for managing scan targets with support for language detection and exclusion patterns.

class TargetManager: ...
class ScanningRoot: ...
class Target: ...
class FilteredFiles: ...

See Target Management (target-management.md).
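
As a rough picture of what file discovery with language detection and exclusion patterns involves, the sketch below walks a directory, skips excluded paths, and groups the remaining files by language. It is a simplified illustration, not `TargetManager`: the function `discover_targets` and the small extension table are hypothetical, and `fnmatch` patterns differ from gitignore-style rules (here `*` also matches `/`).

```python
import fnmatch
import tempfile
from pathlib import Path
from typing import Dict, Iterable, List

# Tiny extension-to-language table for illustration; real detection covers far more.
EXTENSION_LANGUAGES = {".py": "python", ".js": "javascript", ".go": "go"}


def discover_targets(root: Path, exclude: Iterable[str] = ()) -> Dict[str, List[Path]]:
    """Group files under `root` by language, skipping excluded paths."""
    patterns = list(exclude)
    targets: Dict[str, List[Path]] = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(root).as_posix()
        if any(fnmatch.fnmatch(relative, pattern) for pattern in patterns):
            continue
        language = EXTENSION_LANGUAGES.get(path.suffix)
        if language is not None:  # files with unknown extensions are ignored
            targets.setdefault(language, []).append(path)
    return targets


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / "src").mkdir()
    (demo_root / "tests").mkdir()
    (demo_root / "src" / "app.py").write_text("x = 1\n")
    (demo_root / "tests" / "test_app.py").write_text("pass\n")
    demo_languages = sorted(discover_targets(demo_root, exclude=["tests/*"]))
```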