cx-incident-management

Use this skill when the user asks to "investigate incident", "triage this alert", "what's firing", "who got paged", "incident response", "check incident status", "SLO breaching", "error budget burned", "check service level", "SLI status", "who was notified", "check notification delivery", "verify alert routing", "MTTR", "incident severity", "error budget", "burn rate", "acknowledge incident", "resolve incident", "production incident", "what alerts are active", "incident timeline", "on-call triage", or wants to triage, manage, or respond to incidents using alerts, SLOs, and notifications.

Incident Management Skill

Use this skill as the gateway for incident triage, SLO monitoring, and notification verification. It orchestrates the full triage workflow, from detection through resolution, and hands off to cx-alerts for deep alert management and to cx-telemetry-querying for root cause investigation.


CLI Commands

| Command | Subcommands | Purpose |
| --- | --- | --- |
| cx incidents | list, get, acknowledge, resolve, close, assign, unassign, events, aggregations | Manage and triage incidents |
| cx slos | list, get, create, update, delete | Monitor and manage SLO definitions |
| cx alerts | list, get | Check which alerts are firing (see cx-alerts skill for full alert management) |
| cx notifications connectors | list, get | Verify notification connector configuration |
| cx notifications routers | list, get | Verify notification routing rules |
| cx notifications presets | list, get | Check notification preset templates |
| cx notifications test | connector, destination, preset, routing-condition, template-render | Test notification delivery |

Key flags:

  • cx incidents list supports --status (TRIGGERED, ACKNOWLEDGED, RESOLVED), --severity (CRITICAL, WARNING, INFO), --assignee
  • All commands support -o json for structured output and -p <profile> for profile selection
  • cx slos create/update use --from-file <path> (or - for stdin)

Incident Triage Workflow

Step 1: Check Active Incidents

cx incidents list -o json
cx incidents list --status TRIGGERED -o json
cx incidents list --severity CRITICAL -o json

Start with an overview of what is open, then filter by severity to surface immediate priorities:

cx incidents list -o json | jq '[.[] | select(.severity == "CRITICAL") | {id, name, status, severity, started_at}]'
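
When many incidents are open, a count per severity and status can be easier to scan than the full list. The following is a minimal sketch, assuming `cx incidents list -o json` returns a JSON array whose objects carry the `severity` and `status` fields used in the filter above. The function name `summarize_incidents` and the sample data are hypothetical.

```shell
#!/bin/sh
# Summarize incidents as "severity <TAB> status <TAB> count".
# Reads a JSON array on stdin, in the shape `cx incidents list -o json`
# is assumed to return (objects with `severity` and `status` fields).
summarize_incidents() {
  jq -r 'group_by([.severity, .status])
         | map("\(.[0].severity)\t\(.[0].status)\t\(length)")
         | .[]'
}

# Hypothetical sample data standing in for real CLI output.
sample='[
  {"id": "inc-1", "severity": "CRITICAL", "status": "TRIGGERED"},
  {"id": "inc-2", "severity": "CRITICAL", "status": "TRIGGERED"},
  {"id": "inc-3", "severity": "WARNING",  "status": "ACKNOWLEDGED"}
]'

printf '%s' "$sample" | summarize_incidents
```

Against a live environment the same function would sit at the end of the pipeline: `cx incidents list -o json | summarize_incidents`.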

Step 2: Get Incident Details

cx incidents get <incident-id> -o json
cx incidents events --incident-id <incident-id> -o json

Review the incident timeline and related events to understand scope and progression.

Step 3: Check Related Alerts

cx alerts list -o json

Find which alerts are currently firing. For deep alert inspection, switch to the cx-alerts skill.

cx alerts list -o json | jq '[.[] | select(.is_active == true) | {id, name, severity, last_triggered}]'

Step 4: Review SLO Status

cx slos list -o json
cx slos get <slo-id> -o json

Check whether any SLO is breaching or has exhausted its error budget:

cx slos list -o json | jq '[.[] | {name, status, remaining_budget_percentage}]'

Step 5: Verify Notifications

cx notifications connectors list -o json
cx notifications routers list -o json
cx notifications presets list -o json

Confirm the right people were notified through the correct channels.

Step 6: Pivot to Root Cause

Switch to the cx-telemetry-querying skill to investigate the underlying cause using logs, traces, and metrics.


Incident Actions

Acknowledge

cx incidents acknowledge <incident-id>
cx incidents acknowledge <id1> <id2> <id3>

Resolve

cx incidents resolve <incident-id>
cx incidents resolve <id1> <id2> <id3>

Assign

cx incidents assign <incident-id> --user-id <user-id>

Close

cx incidents close <incident-id>
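
Because acknowledge and resolve accept several IDs at once, the ID list can be produced by a filter rather than typed by hand. This is a sketch under stated assumptions: the incident objects are assumed to expose `id`, `status`, and `severity` as in the triage examples, and the helper name `triggered_critical_ids` and the sample data are hypothetical.

```shell
#!/bin/sh
# Print the IDs of incidents that are still TRIGGERED at CRITICAL severity,
# one per line. Reads the incident list JSON on stdin.
triggered_critical_ids() {
  jq -r '.[]
         | select(.status == "TRIGGERED" and .severity == "CRITICAL")
         | .id'
}

# Hypothetical sample data standing in for `cx incidents list -o json`.
sample='[
  {"id": "inc-1", "severity": "CRITICAL", "status": "TRIGGERED"},
  {"id": "inc-2", "severity": "WARNING",  "status": "TRIGGERED"},
  {"id": "inc-3", "severity": "CRITICAL", "status": "TRIGGERED"},
  {"id": "inc-4", "severity": "CRITICAL", "status": "RESOLVED"}
]'

ids=$(printf '%s' "$sample" | triggered_critical_ids)
printf '%s\n' "$ids"

# With a real environment, the IDs could be passed on in a single call:
#   cx incidents list -o json | triggered_critical_ids | xargs cx incidents acknowledge
```

Review the printed IDs before piping them into an action. Some `xargs` implementations still run the command once when the input is empty, so it is worth checking that the list is non-empty first.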

SLO Management

Creating SLOs

Template from an existing SLO:

cx slos get <existing-slo-id> -o json > slo-template.json
# Edit slo-template.json with new service/threshold
cx slos create --from-file slo-template.json
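
The manual edit step can also be scripted. The sketch below assumes the exported SLO is a single JSON object with top-level `id`, `name`, and `target_percentage` fields. Those names are assumptions based on the fields queried elsewhere in this skill, so check them against real `cx slos get` output before relying on them. The helper name `retarget_slo` and the sample values are hypothetical.

```shell
#!/bin/sh
# Derive a new SLO definition from an exported one.
# $1 = new SLO name, $2 = new target percentage (a number).
# Reads the exported SLO JSON on stdin and writes the edited JSON to stdout.
# Assumption: `id`, `name`, and `target_percentage` are top-level fields.
retarget_slo() {
  jq --arg name "$1" --argjson target "$2" \
     'del(.id) | .name = $name | .target_percentage = $target'
}

# Hypothetical export standing in for slo-template.json.
template='{"id": "slo-123", "name": "checkout-availability", "target_percentage": 99.9}'

new_slo=$(printf '%s' "$template" | retarget_slo "payments-availability" 99.5)
printf '%s\n' "$new_slo"

# With a real environment, the result could be created via stdin:
#   retarget_slo "payments-availability" 99.5 < slo-template.json | cx slos create --from-file -
```

Dropping `id` avoids submitting the source SLO's identifier with the new definition. Whether the API requires that is not confirmed by this document.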

Monitoring SLO Health

# All SLOs with their status
cx slos list -o json | jq '[.[] | {name, status, target_percentage, remaining_budget}]'

# SLOs in any non-OK state (breaching or otherwise unhealthy)
cx slos list -o json | jq '[.[] | select(.status != "OK")]'
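
An SLO can still report OK while it is close to running out of budget, so sorting by remaining budget complements the status filter. This is a minimal sketch: the examples in this skill name the budget field both `remaining_budget_percentage` and `remaining_budget`, so the filter accepts either. The helper name `slos_by_budget`, the sample data, and the `BREACHED` status value are hypothetical.

```shell
#!/bin/sh
# List SLOs ordered by remaining error budget, lowest first, as
# "name <TAB> status <TAB> remaining budget".
# Reads the SLO list JSON on stdin. The budget field name is an assumption:
# either `remaining_budget_percentage` or `remaining_budget` is accepted.
slos_by_budget() {
  jq -r 'map(. + {budget: (.remaining_budget_percentage // .remaining_budget)})
         | sort_by(.budget)
         | .[]
         | "\(.name)\t\(.status)\t\(.budget)"'
}

# Hypothetical sample data standing in for `cx slos list -o json`.
sample='[
  {"name": "search-latency",        "status": "OK",       "remaining_budget": 80},
  {"name": "checkout-availability", "status": "BREACHED", "remaining_budget": 0},
  {"name": "login-availability",    "status": "OK",       "remaining_budget": 12.5}
]'

printf '%s' "$sample" | slos_by_budget
```

In the sample, `login-availability` is OK but appears second in the list, which is the kind of SLO the status filter alone would miss.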

Notification Debugging

When notifications aren't reaching the right people:

1. Check Connectors

cx notifications connectors list -o json | jq '[.[] | {id, name, type}]'

Verify the expected channels (Slack, PagerDuty, email) exist and are configured.

2. Check Routers

cx notifications routers list -o json | jq '[.[] | {id, name, entity_type}]'

Verify that routing rules map the right alert types to the right connectors.

3. Test Notification Delivery

cx notifications test connector --from-file test-connector.json
cx notifications test destination --from-file test-destination.json
cx notifications test preset --from-file test-preset.json
cx notifications test routing-condition --from-file test-condition.json
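
The connector check in step 1 can be turned into a pass/fail test by comparing the connector list against the channel types you expect. This is a sketch, assuming each connector object has the `type` field shown in step 1. The type values (`SLACK`, `PAGERDUTY`, `EMAIL`), the helper name `missing_connector_types`, and the sample data are hypothetical, and the `--args` option requires jq 1.6 or later.

```shell
#!/bin/sh
# Print every expected connector type that is absent from the connector list.
# Arguments: the expected type names. Stdin: the connector list JSON.
# Prints nothing when all expected types are present.
missing_connector_types() {
  jq -r '$ARGS.positional - [.[].type] | .[]' --args "$@"
}

# Hypothetical sample data standing in for `cx notifications connectors list -o json`.
sample='[
  {"id": "c1", "name": "oncall-slack", "type": "SLACK"},
  {"id": "c2", "name": "ops-email",    "type": "EMAIL"}
]'

missing=$(printf '%s' "$sample" | missing_connector_types SLACK PAGERDUTY)
printf 'missing: %s\n' "$missing"
```

A non-empty result points at a configuration gap to fix before testing routers or delivery.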

Incident Aggregations

Get a high-level view of incident patterns:

cx incidents aggregations -o json

Use this to understand incident frequency, MTTR trends, and severity distribution.


Key Principles

  • Triage before deep-dive - check incidents, alerts, and SLOs before querying telemetry data
  • Check SLO burn rate, not just status - a slowly burning SLO needs attention before it breaches
  • Verify notification chain end-to-end - connector exists → router maps correctly → test delivery works
  • Cross-reference with telemetry - use cx-telemetry-querying skill for root cause after triage
  • Acknowledge promptly - acknowledge incidents to signal ownership and stop re-notifications
  • Use incident events for timeline - cx incidents events shows the full incident lifecycle
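
To make the burn-rate principle concrete: burn rate is the observed error rate divided by the error rate the SLO allows (100% minus the target). A burn rate of 1 consumes the budget exactly over the SLO window, and higher values exhaust it proportionally sooner. The sketch below is plain arithmetic and does not call the `cx` CLI. The helper names and the input numbers are hypothetical, and the observed error rate is assumed to come from your own telemetry query.

```shell
#!/bin/sh
# Burn rate = observed error rate / allowed error rate.
# $1 = SLO target in percent (e.g. 99.9), $2 = observed error rate in percent.
burn_rate() {
  awk -v target="$1" -v err="$2" 'BEGIN { printf "%.2f\n", err / (100 - target) }'
}

# Days until a full budget is exhausted at a constant burn rate.
# $1 = SLO window in days, $2 = burn rate.
days_to_exhaustion() {
  awk -v window="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", window / rate }'
}

# Example: a 99.9% target allows 0.1% errors; observing 0.5% errors burns 5x.
burn_rate 99.9 0.5          # prints 5.00
days_to_exhaustion 30 5     # prints 6.0
```

In this example the SLO may still report a healthy status today, yet at five times the sustainable rate a full 30-day budget would last only six days.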

Related Skills

  • cx-alerts - deep alert management: creating, updating, and inspecting alert definitions
  • cx-telemetry-querying - root cause investigation using logs, metrics, traces, and RUM
  • cx-observability-setup - configure notification channels and routing for alerts
Repository: coralogix/cx-cli