Comprehensive toolkit for validating, optimizing, and understanding Prometheus Query Language (PromQL) queries. Use this skill when working with PromQL queries to check syntax, detect anti-patterns, identify optimization opportunities, and interactively plan queries with users.
This skill uses a two-phase interactive workflow: Phase 1 (Steps 1-4) presents validation results and asks clarifying questions, then stops and waits for user response before proceeding to Phase 2 (Steps 5-7) for tailored recommendations.
When a user provides a PromQL query, follow this workflow:
Run the syntax validation script to check for basic correctness:
python3 .claude/skills/promql-validator/scripts/validate_syntax.py "<query>"
Run the best practices checker to detect anti-patterns and optimization opportunities:
python3 .claude/skills/promql-validator/scripts/check_best_practices.py "<query>"
Parse and explain what the query does in plain English:
Ask the user clarifying questions to verify the query matches their intent:
⏸️ STOP HERE AND WAIT FOR USER RESPONSE before proceeding to Steps 5-7.
After understanding the user's intent:
When relevant, mention known limitations:
_bytes suffix. Please confirm if this is correct.")
Based on validation results:
Reference Examples: When suggesting corrections, cite relevant examples using this format:
As shown in `examples/bad_queries.promql` (lines 91-97):
❌ BAD: `avg(http_request_duration_seconds{quantile="0.95"})`
✅ GOOD: Use histogram_quantile() with histogram buckets
Citation sources:
assets/good_queries.promql - for well-formed patterns
assets/optimization_examples.promql - for before/after comparisons
assets/bad_queries.promql - for showing what to avoid
references/best_practices.md - for detailed explanations
references/anti_patterns.md - for anti-pattern deep dives
Citation Format: file_path (lines X-Y) with the relevant code snippet quoted
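The citation format above can be produced mechanically. The following is a minimal sketch, not part of the skill itself; the helper name `format_citation` and the choice to wrap the snippet in backticks are assumptions made for illustration.

```python
def format_citation(path: str, start: int, end: int, snippet: str) -> str:
    """Render 'file_path (lines X-Y)' followed by the quoted snippet.

    Hypothetical helper: the skill only specifies the citation format,
    not a function that builds it.
    """
    if start < 1 or end < start:
        raise ValueError("line range must satisfy 1 <= start <= end")
    # Quote the snippet with backticks, matching the BAD/GOOD example style.
    return f"{path} (lines {start}-{end}): `{snippet.strip()}`"


if __name__ == "__main__":
    print(
        format_citation(
            "examples/bad_queries.promql",
            91,
            97,
            'avg(http_request_duration_seconds{quantile="0.95"})',
        )
    )
```

Validating the line range up front keeps a malformed citation such as "lines 97-91" from reaching the user.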
Give the user control:
Claude: "I've validated your query. It's syntactically correct, but I notice it queries http_requests_total without any label filters. This could match thousands of time series. What specific service or endpoint are you trying to monitor?"
User: [provides intent]
Claude: "Great! Based on that, here's an optimized version: `rate(http_requests_total{job="api-service", path="/users"}[5m])`. This calculates the per-second rate of requests to the /users endpoint over the last 5 minutes. Does this match what you need?"
User: [confirms or asks for changes]
Claude: [provides refined query or alternatives]
Metric types are inferred from naming conventions (e.g., _total, _bytes). Non-standard names may be misclassified — ask the user to confirm when uncertain.
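To illustrate why suffix-based inference can misclassify, here is a rough sketch of the idea. The suffix table and the function name `infer_metric_type` are assumptions for illustration and are not taken from the skill's scripts, whose actual rules may differ.

```python
from typing import Optional

# Assumed suffix-to-type table, checked in order. This is an illustrative
# guess at common Prometheus naming conventions, not the scripts' real rules.
SUFFIX_TYPES = (
    ("_total", "counter"),    # e.g. http_requests_total
    ("_bucket", "histogram"), # e.g. http_request_duration_seconds_bucket
    ("_bytes", "gauge"),      # e.g. node_memory_MemFree_bytes
)


def infer_metric_type(name: str) -> Optional[str]:
    """Guess a metric type from its name; None means 'ask the user'."""
    for suffix, metric_type in SUFFIX_TYPES:
        if name.endswith(suffix):
            return metric_type
    return None


if __name__ == "__main__":
    for metric in ("http_requests_total", "node_memory_MemFree_bytes", "queue_depth"):
        print(metric, "->", infer_metric_type(metric))
```

A name such as `queue_depth` matches no suffix, which is the case where confirming the type with the user is the safer path.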
The scripts flag metrics without label selectors, but recording rule metrics (e.g., job:http_requests:rate5m) and low-cardinality cases are legitimate without filters. Users can safely ignore the warning when they know their cardinality is manageable.
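The exemption for recording rules can be sketched as follows. This is a simplified illustration that handles only a bare vector selector, not a full PromQL expression; the function name `needs_selector_warning` is hypothetical, and treating any colon in the metric name as a recording rule is an assumption based on the `level:metric:operations` naming convention.

```python
import re


def needs_selector_warning(selector: str) -> bool:
    """Return True when a bare selector has no label matchers.

    Simplified sketch: expects input like 'up' or 'up{job="api"}',
    not an arbitrary PromQL expression.
    """
    name = selector.split("{", 1)[0].strip()
    # Assumption: a colon in the name marks a recording rule,
    # which is already aggregated and is fine without filters.
    if ":" in name:
        return False
    match = re.search(r"\{(.*)\}", selector)
    has_matchers = bool(match and match.group(1).strip())
    return not has_matchers


if __name__ == "__main__":
    for s in ("http_requests_total", 'http_requests_total{job="api-service"}',
              "job:http_requests:rate5m"):
        print(s, "->", needs_selector_warning(s))
```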
Scripts cannot verify whether metrics exist or whether label values are valid in any specific Prometheus deployment. For production use, test queries against an actual Prometheus instance.
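One way to test a query against a live instance is the Prometheus HTTP API's instant query endpoint, `/api/v1/query`. The sketch below is not part of the skill; the helper names are hypothetical, and the base URL `http://localhost:9090` is an assumption that should be replaced with the address of the actual deployment.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def instant_query_url(base_url: str, query: str) -> str:
    """Build the URL for an instant query against the Prometheus HTTP API."""
    params = urlencode({"query": query})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def run_instant_query(base_url: str, query: str, timeout: float = 10.0) -> dict:
    """Send the query and return the decoded JSON response body."""
    with urlopen(instant_query_url(base_url, query), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Assumed local instance; adjust to the real Prometheus address.
    print(instant_query_url("http://localhost:9090", 'up{job="api"}'))
```

An empty result set for a query that validates cleanly is a hint that the metric name or a label value does not exist in that deployment.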
The scripts detect common anti-patterns but cannot catch business logic errors, context-specific optimizations (e.g., based on scrape interval or retention), or custom function behavior from extensions.
The skill uses two main Python scripts:
Both scripts output JSON for programmatic parsing and human-readable messages for display.
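A caller that wants the JSON output could wrap the two commands as sketched below. The script paths come from the commands shown earlier; everything else is an assumption, in particular that stdout contains a single JSON document. The source does not describe the output schema or how JSON and human-readable output are separated, so the sketch falls back to returning the raw text when parsing fails.

```python
import json
import subprocess

# Directory taken from the commands in the workflow steps.
SCRIPTS_DIR = ".claude/skills/promql-validator/scripts"


def build_command(script: str, query: str) -> list:
    """Build the argv list; passing the query as one element avoids shell quoting issues."""
    return ["python3", f"{SCRIPTS_DIR}/{script}", query]


def run_check(script: str, query: str):
    """Run one script and try to decode its stdout as JSON.

    Assumption: stdout is a single JSON document. If it is not,
    return the raw output so the caller can still display it.
    """
    proc = subprocess.run(build_command(script, query), capture_output=True, text=True)
    try:
        return json.loads(proc.stdout)
    except json.JSONDecodeError:
        return {"raw_output": proc.stdout, "returncode": proc.returncode}


if __name__ == "__main__":
    query = "rate(http_requests_total[5m])"
    for script in ("validate_syntax.py", "check_best_practices.py"):
        print(script, run_check(script, query))
```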
Install with Tessl CLI
npx tessl i pantheon-ai/promql-validator