Name: tessl/pypi-crosshair-tool
Rating: 86 (1 review)
Author: tessl

Analyze Python code for correctness using symbolic execution and SMT solving to automatically find counterexamples for functions with type annotations and contracts.

Quality

Pending

Does it follow best practices?

Impact

86%

1.24x

Average score across 10 eval scenarios

Security

Pending

The risk profile of this skill

{
  "context": "These criteria evaluate how well the engineer uses the crosshair-tool package to implement behavioral difference detection between two Python functions. The focus is on correct use of CrossHair's diff_behavior functionality and related APIs for symbolic-execution-based comparison.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Import diff_behavior",
      "description": "Imports the diff_behavior function from crosshair.diff_behavior module to access the behavioral comparison functionality",
      "max_score": 15
    },
    {
      "name": "Call diff_behavior",
      "description": "Invokes the diff_behavior() function to compare two functions symbolically, passing both function references as arguments (fn1 and fn2 parameters)",
      "max_score": 30
    },
    {
      "name": "Iterate over results",
      "description": "Iterates over the generator returned by diff_behavior() to retrieve BehaviorDiff instances representing discovered differences",
      "max_score": 20
    },
    {
      "name": "Extract input arguments",
      "description": "Accesses the .args property of BehaviorDiff objects to extract the input arguments that caused the behavioral difference",
      "max_score": 20
    },
    {
      "name": "Configure analysis options",
      "description": "Passes AnalysisOptions to the options parameter of diff_behavior() to control timeout settings like per_path_timeout or per_condition_timeout",
      "max_score": 15
    }
  ]
}

evals/scenario-5/criteria.json