github.com/ArabelaTso/Skills-4-SE
Skill | Added | Review |
|---|---|---|
formal-spec-generator Generate formal specifications (definitions, predicates, invariants, pre/post-conditions) in Isabelle/HOL or Coq from informal requirements, source code, pseudocode, or mathematical descriptions. Use when users need to: (1) Formalize algorithms or data structures, (2) Create function specifications with contracts, (3) Generate predicates and properties for verification, (4) Translate informal requirements into formal logic, (5) Specify invariants for loops or data structures, or (6) Create formal definitions for mathematical concepts. Supports both Isabelle/HOL and Coq equally. | 86 1.00x No change in agent success vs baseline Impact 99% 1.00xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
program-correctness-prover Generate Isabelle or Coq proofs establishing partial or total correctness of imperative programs from code and formal specifications. Use when users need to: (1) Prove program correctness using Hoare logic, (2) Generate verification conditions from pre/postconditions, (3) Construct loop invariants and termination arguments, (4) Verify imperative programs with assignments, conditionals, and loops. Supports both partial correctness (if terminates, postcondition holds) and total correctness (terminates and postcondition holds) for both Isabelle/HOL and Coq. | 95 1.25x Agent success vs baseline Impact 100% 1.25xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
cve-watchlist-action-recommendation-generator Generate prioritized CVE watchlists and actionable security recommendations for repositories. Use when analyzing CVE scan results, creating security reports, prioritizing vulnerability remediation, or generating security gate reports for CI/CD. Takes CVE scan results (JSON/SARIF from npm audit, pip-audit, Snyk), reachability analysis, and cutoff date as input. Combines severity, reachability, exploitability, and dependency criticality to rank CVEs by practical risk. Outputs markdown reports with concrete next-step guidance (immediate upgrade, monitor, ignore with justification, apply mitigation) suitable for issue trackers, security reviews, and CI security gates. | 92 1.38x Agent success vs baseline Impact 86% 1.38xAverage score across 3 eval scenarios Securityby Advisory Suggest reviewing before use Reviewed: Version: 0f00a4f | |
interface-contract-verifier Verify that interface and class contracts (preconditions, postconditions, invariants) are preserved across program versions. Use when validating refactorings, checking API compatibility, verifying design-by-contract implementations, or ensuring behavioral contracts remain intact after code changes. Automatically detects contract violations, identifies affected methods and classes, and provides actionable guidance for resolving violations while maintaining program correctness. | 84 2.02x Agent success vs baseline Impact 95% 2.02xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
git-bisect-assistant Automatically performs git bisect to identify the first bad commit that introduced a bug or failure. Use when debugging regressions, tracking down when a test started failing, or identifying which commit broke functionality. Handles flaky tests with retry logic and provides comprehensive reports with bisect logs and confidence levels. | 86 2.27x Agent success vs baseline Impact 100% 2.27xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
program-to-tlaplus-spec-generator Automatically generate TLA+ specifications from program code, repositories, or system implementations. Use when asked to generate TLA+ spec, create TLA+ specification from code, convert program to TLA+, formalize system in TLA+, extract TLA+ model from code, or when working with formal specification of concurrent systems, distributed systems, protocols, algorithms, or state machines that need to be verified. | 89 1.04x Agent success vs baseline Impact 95% 1.04xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
program-to-model-extractor Extract abstract mathematical models from functional code (Haskell, OCaml, F#) for formal reasoning in Isabelle/HOL. Use when users need to: (1) Convert functional programs to Isabelle definitions, (2) Extract high-level algorithm essence from implementation code, (3) Generate formal specifications and properties from code, (4) Create verification-ready models that capture mathematical properties while abstracting away implementation details. Focuses on structural recursion, algebraic data types, higher-order functions, and invariant extraction. | 90 1.11x Agent success vs baseline Impact 98% 1.11xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
lemma-discovery-assistant Analyze failed or stuck proofs and propose auxiliary lemmas to help complete the proof in Isabelle/HOL or Coq. Use when encountering proof failures, stuck proof states, unprovable subgoals, or when needing to strengthen induction hypotheses. Identifies missing lemmas, suggests proof strategies, and generates helper lemmas with appropriate statements and proof sketches. Supports inductive proofs, case analysis, rewriting, and complex proof obligations. | 93 1.13x Agent success vs baseline Impact 93% 1.13xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
cve-reachability-analyzer Analyze CVE reachability in software repositories by examining how vulnerable dependencies are imported and used. Determines whether vulnerable components, classes, or functions are reachable from project code through call chain analysis, reflection detection, dynamic loading patterns, and configuration-gated behavior. Classifies each CVE as likely reachable, possibly reachable, or likely unreachable with supporting evidence. Use when analyzing security vulnerabilities in dependencies, performing post-disclosure CVE triage, assessing vulnerability impact, or when users ask to analyze CVE reachability, check if vulnerabilities are exploitable, or evaluate dependency security risks. | 90 1.05x Agent success vs baseline Impact 99% 1.05xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
legacy-code-summarizer Produces comprehensive summaries and insights about legacy codebases to help understand unfamiliar code. Use when onboarding to a new project, planning refactoring efforts, assessing code for acquisition/migration, or generating documentation for undocumented systems. Analyzes architecture, dependencies, code quality issues, and test coverage. Creates high-level overviews with architecture diagrams, key components, entry points, and actionable insights for understanding and improving legacy code. | 81 1.13x Agent success vs baseline Impact 90% 1.13xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
conflict-analyzer Identifies and analyzes conflicts in software requirements including logical contradictions, technical incompatibilities, resource constraints, timeline issues, data conflicts, and stakeholder priority mismatches. Use when reviewing requirement sets, specifications, user stories, or project plans to detect conflicts that could block implementation or cause rework. Provides detailed conflict analysis with resolution strategies and impact assessment. | 94 1.15x Agent success vs baseline Impact 98% 1.15xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
proof-skeleton-generator Generate structured proof skeletons with tactics, strategies, and intermediate lemmas for theorems in Isabelle/HOL or Coq. Use when users need to: (1) Create proof outlines for theorem statements, (2) Generate proof structure with tactic placeholders, (3) Identify key lemmas needed for a proof, (4) Plan proof strategies (induction, case analysis, forward/backward reasoning), (5) Scaffold proofs with intermediate steps and subgoals, or (6) Convert theorem statements into detailed proof templates. Supports both Isabelle/HOL and Coq equally. | 88 1.24x Agent success vs baseline Impact 92% 1.24xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
metamorphic-test-generator Generate test cases using metamorphic testing by applying transformations based on metamorphic properties. Use when you need to expand test suites, test programs without oracles, validate mathematical or algorithmic properties, or detect subtle bugs through input-output relationships. The skill takes a program, original test cases, and metamorphic properties as input, generates new test cases by applying transformations, executes tests, verifies outputs satisfy properties, reports violations and anomalies, and outputs an expanded test suite with property coverage summary. Supports multiple programming languages and property types. | 81 1.57x Agent success vs baseline Impact 88% 1.57xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
mocking-test-generator Generate unit tests with proper mocking for Python (unittest.mock/pytest) or Java (Mockito/JUnit) code. Use when users request test generation, unit tests with mocks, or testing code that has external dependencies like database calls, API requests, file I/O, or network operations. Automatically identifies dependencies to mock and creates executable, maintainable test code. | 89 1.12x Agent success vs baseline Impact 91% 1.12xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
nl-to-constraints Transforms natural language requirements (user stories, verbal descriptions, business rules) into formal specifications and constraints. Use when converting informal requirements into structured, testable specifications with explicit constraints. Outputs in multiple formats including BDD-style Given-When-Then, JSON Schema, and structured plain text requirements documents. | 93 1.62x Agent success vs baseline Impact 96% 1.62xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
interval-profiling-performance-analyzer Profile programs at the function/method level to identify performance hotspots, bottlenecks, and optimization opportunities. Records execution time, memory usage, and call frequency for each interval. Generates actionable recommendations and visualizations. Use when users need to (1) analyze program performance, (2) identify slow functions or bottlenecks, (3) optimize execution time or memory usage, (4) profile Python, Java, or C/C++ programs with test cases or workload scenarios, or (5) generate performance reports with flame graphs and recommendations. | 91 1.23x Agent success vs baseline Impact 99% 1.23xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
proof-carrying-code-generator Generate executable code together with formal proofs certifying safety and correctness properties in Isabelle/HOL or Coq. Use when building verified software, safety-critical systems, or when formal guarantees are required. Produces code with accompanying proofs for memory safety, bounds checking, functional correctness, invariant preservation, and termination. Supports extraction to OCaml/Haskell/SML and integration with existing codebases. | 93 1.01x Agent success vs baseline Impact 92% 1.01xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
module-level-code-translator Translate source code between programming languages at function, class, and module levels while preserving behavior and generating verification tests. Use when translating code from one language to another (e.g., "translate this Python module to JavaScript", "convert this Java class to C#", "port this code to Go and generate tests"), migrating codebases between languages, or creating equivalent implementations across different technology stacks. Handles idiom adaptation, standard library mappings, and test generation. | 89 1.03x Agent success vs baseline Impact 96% 1.03xAverage score across 3 eval scenarios Securityby Risky Do not use without reviewing Reviewed: Version: 0f00a4f | |
function-class-generator Generate complete, production-ready functions and classes from formal specifications, design descriptions, type signatures, or natural language requirements. Use this skill when implementing APIs from specifications, creating data structures from schemas, building classes from UML diagrams, generating code from contracts, or translating design documents into code. Supports multiple programming languages and follows language-specific best practices. | 81 1.16x Agent success vs baseline Impact 94% 1.16xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
dependency-resolver Identify, analyze, and manage software dependencies before deployment. Use this skill when preparing applications for deployment, resolving dependency conflicts, updating dependencies, auditing security vulnerabilities, managing package versions, or troubleshooting dependency-related issues. Supports multiple package managers (npm, pip, maven, cargo, go mod, composer) and provides actionable recommendations for dependency management. | 77 1.02x Agent success vs baseline Impact 71% 1.02xAverage score across 3 eval scenarios Securityby Advisory Suggest reviewing before use Reviewed: Version: 0f00a4f | |
proof-refactoring-assistant Restructure and improve Isabelle or Coq proofs to enhance readability, modularity, and maintainability without changing semantics. Use when proofs are long and monolithic, have repeated patterns, use unclear naming, lack documentation, or when the user asks to refactor, clean up, improve, or reorganize their formal proofs. | 91 1.00x No change in agent success vs baseline Impact 100% 1.00xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
interval-guided-regression-test-update Automatically updates regression tests based on interval analysis to maintain coverage of key program intervals. Use when code changes affect value ranges, conditionals, or control flow, and existing tests need updating to maintain interval coverage. Analyzes interval information from updated code, identifies coverage gaps, adjusts test inputs and assertions, removes redundant tests, and generates new tests for uncovered intervals. Supports Python, Java, JavaScript, and C/C++ with various test frameworks (pytest, JUnit, Jest, Google Test). | 79 0.92x Agent success vs baseline Impact 88% 0.92xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
exploitability-analyzer Analyze detected vulnerabilities to assess realistic exploitability by examining control flow, input sources, sanitization logic, and execution context. Use when users need to: (1) Determine if a vulnerability is actually exploitable in practice, (2) Assess severity and impact of security issues, (3) Prioritize vulnerability remediation, (4) Understand attack vectors and exploitation conditions, (5) Generate exploitability reports with proof-of-concept scenarios. Focuses on injection vulnerabilities (SQL, command, XSS, path traversal, LDAP) with detailed analysis of reachability, controllability, sanitization, and impact. | 91 1.32x Agent success vs baseline Impact 97% 1.32xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
incremental-java-programmer Incrementally implement new features in Java repositories from natural language descriptions. Use when adding functionality to existing Java codebases (Maven or Gradle projects). Takes a feature description as input and outputs modified repository with implementation code, corresponding JUnit tests, and verification that all tests pass. Supports method additions, new class creation, and method modifications with proper Java conventions. | 91 1.04x Agent success vs baseline Impact 99% 1.04xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f | |
framework-migration-assistant Automatically migrate Python web applications between frameworks (Flask → FastAPI, Django → FastAPI). Use when you need to migrate an existing web application to a modern framework while preserving functionality. The skill analyzes the codebase, updates routes, handlers, configuration, dependency injection patterns, and tests. Creates git commits for each migration phase and generates a comprehensive summary of all changes. Supports automatic dependency updates, code transformations, and test adaptations. | 89 1.24x Agent success vs baseline Impact 93% 1.24xAverage score across 3 eval scenarios Securityby Passed No known issues Reviewed: Version: 0f00a4f |