tessl install tessl/pypi-crosshair-tool@0.0.0
Analyze Python code for correctness using symbolic execution and SMT solving to automatically find counterexamples for functions with type annotations and contracts.
Agent Success: 86% (agent success rate when using this tile)
Improvement: 1.25x (agent success rate improvement when using this tile compared to baseline)
Baseline: 69% (agent success rate without this tile)
{
"context": "These criteria evaluate how well the engineer uses crosshair-tool's file watching and continuous analysis capabilities to implement a contract monitoring service. The focus is on proper integration with CrossHair's watch functionality, contract checking APIs, and result handling.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Uses watch command",
"description": "Implementation uses CrossHair's `watch` command or an equivalent API (e.g., a subprocess call to 'crosshair watch' or imports from the crosshair.watcher module) to monitor a directory for file changes.",
"max_score": 25
},
{
"name": "Integrates contract checking",
"description": "Implementation properly invokes CrossHair's contract checking functionality (e.g., using 'crosshair check' command, or importing and calling analysis functions from crosshair.core_and_libs or crosshair.statespace modules) to analyze Python files for contract violations.",
"max_score": 25
},
{
"name": "Handles watch results",
"description": "Implementation correctly captures and processes the output from CrossHair's watch/check operations, parsing the standardized output format (filename:line: error: message) to extract violation details.",
"max_score": 20
},
{
"name": "Implements graceful shutdown",
"description": "Implementation handles interruption signals properly to allow the file watching process to shut down gracefully, using appropriate signal handling or context managers to clean up watch processes.",
"max_score": 15
},
{
"name": "Structured output format",
"description": "Implementation outputs analysis results in the specified JSON structure with file paths, violation details (line numbers and messages), and timestamps as required by the spec.",
"max_score": 15
}
]
}
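
The "Handles watch results" and "Structured output format" items can be illustrated with a minimal sketch. It assumes the output text has already been captured (for example from a subprocess call to 'crosshair check') and that each violation line follows the `filename:line: error: message` shape named in the checklist. The function names `parse_violations` and `build_report` and the JSON field names are hypothetical, since the spec's exact structure is not shown here.

```python
from __future__ import annotations

import json
import re
from datetime import datetime, timezone

# Assumed line shape, taken from the checklist: "filename:line: error: message".
VIOLATION_RE = re.compile(r"^(?P<file>.+?):(?P<line>\d+): error: (?P<message>.*)$")


def parse_violations(output: str) -> list[dict]:
    """Extract violation records from captured output text (hypothetical helper)."""
    violations = []
    for raw in output.splitlines():
        match = VIOLATION_RE.match(raw.strip())
        if match is None:
            continue  # skip lines that do not look like a violation
        violations.append(
            {
                "file": match.group("file"),
                "line": int(match.group("line")),
                "message": match.group("message"),
            }
        )
    return violations


def build_report(output: str, now: datetime | None = None) -> str:
    """Serialize violations as JSON; the field names here are assumptions."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    return json.dumps(
        {"timestamp": stamp, "violations": parse_violations(output)},
        indent=2,
    )


if __name__ == "__main__":
    sample = "src/a.py:12: error: false when calling f(0)\nunrelated line\n"
    print(build_report(sample))
```

Keeping the parser separate from whatever launches CrossHair makes it possible to check the result handling against canned output without running an analysis.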