Complete PromQL toolkit with generation and validation capabilities
{
"instructions": [
{
"instruction": "Always engage the user in collaborative planning before generating any query — never skip the planning phase",
"original_snippets": "CRITICAL: Always engage the user in collaborative planning before generating any query. Never skip the planning phase.",
"relevant_when": "At the start of any PromQL generation request",
"why_given": "preference"
},
{
"instruction": "Follow the 7-stage workflow: understand goal, identify metrics, determine parameters, present query plan, generate, validate, deliver",
"original_snippets": "Workflow (7 stages): 1. Understand the goal...2. Identify metrics...3. Determine parameters...4. Present the query plan...5. Generate the query...6. Validate...7. Deliver",
"relevant_when": "For every PromQL generation task",
"why_given": "preference"
},
{
"instruction": "Present a plain-English query plan and ask for confirmation before writing any code",
"original_snippets": "Present the query plan — Before writing any code, present a plain-English plan (goal, query structure, expected output, example interpretation) and ask for confirmation via AskUserQuestion",
"relevant_when": "Before generating any PromQL query",
"why_given": "preference"
},
{
"instruction": "Use rate() or increase() for counters; use direct use or *_over_time() for gauges — never rate() on a gauge",
"original_snippets": "Match functions to metric types — rate()/increase() on counters; *_over_time() or direct use for gauges...NEVER use rate() on a gauge metric...rate() computes per-second rate of increase and assumes monotonically increasing counters. Applied to a gauge, it produces nonsensical results",
"relevant_when": "Whenever the query involves counter or gauge metrics",
"why_given": "new knowledge"
},
{
"instruction": "Use histogram_quantile() for histogram latency queries — never avg() across quantile labels",
"original_snippets": "NEVER query a histogram with avg() across quantile labels...Summary metrics expose pre-aggregated {quantile='0.95'} labels that cannot be re-aggregated across instances...GOOD: histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))",
"relevant_when": "When computing percentile latency from histogram or summary metrics",
"why_given": "new knowledge"
},
{
"instruction": "Always add label filters to reduce cardinality and improve performance",
"original_snippets": "Always add label filters — reduces cardinality and improves performance...Key Rules: 1. Always add label filters",
"relevant_when": "When writing any PromQL selector or aggregation",
"why_given": "reminder"
},
{
"instruction": "Prefer exact label matches over regex when the value is known",
"original_snippets": "Prefer exact label matches over regex when the value is known.",
"relevant_when": "When specifying label matchers in PromQL selectors",
"why_given": "preference"
},
{
"instruction": "Prefer by()/without() on all aggregations",
"original_snippets": "Prefer by()/without() on all aggregations.",
"relevant_when": "When writing any aggregation operator (sum, count, avg, max, min, etc.)",
"why_given": "new knowledge"
},
{
"instruction": "Never use a high-cardinality label in by() without filtering first (avoid user_id, request_id, pod without filters)",
"original_snippets": "NEVER use a high-cardinality label in by() without filtering first...Aggregating by user_id, request_id, or pod without label filters produces thousands of series...GOOD: sum by (job, status_code) (rate(http_requests_total{job='api'}[5m]))",
"relevant_when": "When writing sum by() or any aggregation that might group by a high-cardinality label",
"why_given": "new knowledge"
},
{
"instruction": "Use recording rules for expensive or frequently-reused queries, with naming convention level:metric:operations",
"original_snippets": "Use recording rules for queries that are expensive or reused frequently (naming: level:metric:operations)...Recording rule (naming: level:metric:operations): - record: job:http_requests:rate5m",
"relevant_when": "When the user asks about recording rules or when a query is expensive/reused",
"why_given": "new knowledge"
},
{
"instruction": "Always include a for clause on alert rules — never omit it to avoid false positives from transient spikes",
"original_snippets": "NEVER omit the for clause on alert rules...Alerts without for fire immediately on a single evaluation, causing false positives from transient spikes...GOOD: Add for: 5m to require the condition to hold for 5 minutes before firing",
"relevant_when": "When writing Prometheus alerting rules",
"why_given": "new knowledge"
},
{
"instruction": "Never use increase() for alerting thresholds — use rate() instead for stable, scrape-interval-independent thresholds",
"original_snippets": "NEVER use increase() for alerting thresholds...increase() is extrapolated and can return non-integer values on sparse counters. For alerting, rate() produces stable per-second thresholds...BAD: increase(http_requests_total{status=~'5..'}[5m]) > 10. GOOD: rate(http_requests_total{status=~'5..'}[5m]) > 0.033",
"relevant_when": "When writing alert rule expressions based on counters",
"why_given": "new knowledge"
},
{
"instruction": "Format multi-line for complex queries",
"original_snippets": "Format multi-line for complex queries.",
"relevant_when": "When generating queries with multiple operators, aggregations, or binary operations",
"why_given": "preference"
},
{
"instruction": "Cite the applicable pattern from the reference files when generating a query",
"original_snippets": "Once confirmed, read the relevant reference file(s) before writing code, cite the applicable pattern, and apply the best practices below.",
"relevant_when": "When generating any PromQL query",
"why_given": "preference"
},
{
"instruction": "Deliver final output with: the query, plain-English explanation, usage instructions (dashboard/alert/recording rule), customization notes, and related query suggestions",
"original_snippets": "Deliver — Provide the final query, plain-English explanation, usage instructions (dashboard / alert / recording rule), customization notes, and related query suggestions.",
"relevant_when": "After generating and validating a PromQL query",
"why_given": "preference"
},
{
"instruction": "Invoke devops-skills:promql-validator after generating — display structured results (syntax, best practices, explanation) and fix any issues",
"original_snippets": "Validate — Automatically invoke devops-skills:promql-validator. Display structured results (syntax, best practices, explanation). Fix any issues and re-validate until all checks pass.",
"relevant_when": "After generating any PromQL query",
"why_given": "preference"
},
{
"instruction": "Use sum by (le) pattern when computing histogram_quantile to preserve the le label",
"original_snippets": "P95 latency (classic histogram): histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket{job='api-server'}[5m])))",
"relevant_when": "When writing histogram_quantile queries against classic (bucketed) histograms",
"why_given": "new knowledge"
},
{
"instruction": "For multi-window burn-rate alerts, combine short and long window conditions with 'and'",
"original_snippets": "Multi-window burn-rate alert (page: 2% budget in 1h, burn rate 14.4): ...> 14.4 * 0.001 and ... > 14.4 * 0.001",
"relevant_when": "When implementing SLO burn-rate alerting",
"why_given": "new knowledge"
},
{
"instruction": "If already specified by user (e.g., '5-minute window', '> 5% error rate'), acknowledge as pre-filled defaults rather than re-asking",
"original_snippets": "If the user already specified values (e.g., '5-minute window', '> 5% error rate'), acknowledge them as pre-filled defaults and allow quick confirmation rather than re-asking.",
"relevant_when": "During the planning phase when the user's request already contains specifics",
"why_given": "preference"
}
]
}
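Several of the instructions above (recording-rule naming, the `for` clause on alerts, rate()-based thresholds, and the `sum by (le)` histogram pattern) can be illustrated together in one Prometheus rule file. This is a minimal sketch: the metric names follow common Prometheus conventions, but the `job="api"` selector, thresholds, and group name are hypothetical examples, not values from the source.

```yaml
groups:
  - name: api-slo  # hypothetical group name
    rules:
      # Recording rule following the level:metric:operations naming convention,
      # with a label filter and an explicit by() clause on the aggregation
      - record: job:http_requests:rate5m
        expr: sum by (job, status_code) (rate(http_requests_total{job="api"}[5m]))

      # P95 latency from a classic histogram: sum by (le) preserves the
      # bucket label that histogram_quantile() needs
      - record: job:http_request_latency:p95_5m
        expr: histogram_quantile(0.95, sum by (job, le) (rate(http_request_duration_seconds_bucket{job="api"}[5m])))

      # Alert threshold uses rate(), not increase(), so it is stable and
      # independent of the scrape interval; the for clause prevents firing
      # on transient spikes
      - alert: HighErrorRate
        expr: rate(http_requests_total{job="api", status=~"5.."}[5m]) > 0.033
        for: 5m
        labels:
          severity: warning
```

The `> 0.033` threshold is the per-second equivalent of roughly 10 errors per 5 minutes, matching the BAD/GOOD contrast quoted in the increase()-vs-rate() snippet above.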