CtrlK
BlogDocsLog inGet started
Tessl Logo

ideogram-incident-runbook

Execute Ideogram incident response procedures with triage, mitigation, and postmortem. Use when responding to Ideogram-related outages, investigating errors, or running post-incident reviews for Ideogram integration failures. Trigger with phrases like "ideogram incident", "ideogram outage", "ideogram down", "ideogram on-call", "ideogram emergency", "ideogram broken".

Install with Tessl CLI

npx tessl i github:jeremylongshore/claude-code-plugins-plus-skills --skill ideogram-incident-runbook
What are skills?

91

Does it follow best practices?

Validation for skill structure

SKILL.md
Review
Evals

Ideogram Incident Runbook

Overview

Rapid incident response procedures for Ideogram-related outages.

Prerequisites

  • Access to Ideogram dashboard and status page
  • kubectl access to production cluster
  • Prometheus/Grafana access
  • Communication channels (Slack, PagerDuty)

Severity Levels

LevelDefinitionResponse TimeExamples
P1Complete outage< 15 minIdeogram API unreachable
P2Degraded service< 1 hourHigh latency, partial failures
P3Minor impact< 4 hoursWebhook delays, non-critical errors
P4No user impactNext business dayMonitoring gaps

Quick Triage

# 1. Check Ideogram status
curl -s https://status.ideogram.com | jq

# 2. Check our integration health
curl -s https://api.yourapp.com/health | jq '.services.ideogram'

# 3. Check error rate (last 5 min)
curl -s localhost:9090/api/v1/query?query=rate(ideogram_errors_total[5m])

# 4. Recent error logs
kubectl logs -l app=ideogram-integration --since=5m | grep -i error | tail -20

Decision Tree

Ideogram API returning errors?
├─ YES: Is status.ideogram.com showing incident?
│   ├─ YES → Wait for Ideogram to resolve. Enable fallback.
│   └─ NO → Our integration issue. Check credentials, config.
└─ NO: Is our service healthy?
    ├─ YES → Likely resolved or intermittent. Monitor.
    └─ NO → Our infrastructure issue. Check pods, memory, network.

Immediate Actions by Error Type

401/403 - Authentication

# Verify API key is set
kubectl get secret ideogram-secrets -o jsonpath='{.data.api-key}' | base64 -d

# Check if key was rotated
# → Verify in Ideogram dashboard

# Remediation: Update secret and restart pods
kubectl create secret generic ideogram-secrets --from-literal=api-key=NEW_KEY --dry-run=client -o yaml | kubectl apply -f -
kubectl rollout restart deployment/ideogram-integration

429 - Rate Limited

# Check rate limit headers
curl -v https://api.ideogram.com 2>&1 | grep -i rate

# Enable request queuing
kubectl set env deployment/ideogram-integration RATE_LIMIT_MODE=queue

# Long-term: Contact Ideogram for limit increase

500/503 - Ideogram Errors

# Enable graceful degradation
kubectl set env deployment/ideogram-integration IDEOGRAM_FALLBACK=true

# Notify users of degraded service
# Update status page

# Monitor Ideogram status for resolution

Communication Templates

Internal (Slack)

🔴 P1 INCIDENT: Ideogram Integration
Status: INVESTIGATING
Impact: [Describe user impact]
Current action: [What you're doing]
Next update: [Time]
Incident commander: @[name]

External (Status Page)

Ideogram Integration Issue

We're experiencing issues with our Ideogram integration.
Some users may experience [specific impact].

We're actively investigating and will provide updates.

Last updated: [timestamp]

Post-Incident

Evidence Collection

# Generate debug bundle
./scripts/ideogram-debug-bundle.sh

# Export relevant logs
kubectl logs -l app=ideogram-integration --since=1h > incident-logs.txt

# Capture metrics
curl "localhost:9090/api/v1/query_range?query=ideogram_errors_total&start=2h" > metrics.json

Postmortem Template

## Incident: Ideogram [Error Type]
**Date:** YYYY-MM-DD
**Duration:** X hours Y minutes
**Severity:** P[1-4]

### Summary
[1-2 sentence description]

### Timeline
- HH:MM - [Event]
- HH:MM - [Event]

### Root Cause
[Technical explanation]

### Impact
- Users affected: N
- Revenue impact: $X

### Action Items
- [ ] [Preventive measure] - Owner - Due date

Instructions

Step 1: Quick Triage

Run the triage commands to identify the issue source.

Step 2: Follow Decision Tree

Determine if the issue is Ideogram-side or internal.

Step 3: Execute Immediate Actions

Apply the appropriate remediation for the error type.

Step 4: Communicate Status

Update internal and external stakeholders.

Output

  • Issue identified and categorized
  • Remediation applied
  • Stakeholders notified
  • Evidence collected for postmortem

Error Handling

IssueCauseSolution
Can't reach status pageNetwork issueUse mobile or VPN
kubectl failsAuth expiredRe-authenticate
Metrics unavailablePrometheus downCheck backup metrics
Secret rotation failsPermission deniedEscalate to admin

Examples

One-Line Health Check

curl -sf https://api.yourapp.com/health | jq '.services.ideogram.status' || echo "UNHEALTHY"

Resources

  • Ideogram Status Page
  • Ideogram Support

Next Steps

For data handling, see ideogram-data-handling.

Repository
jeremylongshore/claude-code-plugins-plus-skills
Last updated
Created

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.