
kopai/root-cause-analysis

Analyze telemetry data for root cause analysis using Kopai CLI. Use when debugging errors, investigating latency issues, tracing request flows across services, or correlating logs with traces. Also use when users report production issues like "why is my API slow", "getting 500 errors", "service is down", "requests are timing out", or any symptom that needs telemetry-based investigation — even if they don't mention traces or observability explicitly.


rules/pattern-distributed.md

| title | impact | tags |
| --- | --- | --- |
| Pattern: Distributed Failures | HIGH | pattern, distributed, microservices |

Pattern: Distributed Failures

Impact: HIGH

Diagnose failures across multiple services.

Workflow

```bash
# 1. Find errors across services
npx @kopai/cli traces search --status-code ERROR --limit 50 --json

# 2. Group errors by service, highest count first (requires jq)
npx @kopai/cli traces search --status-code ERROR --json | jq 'group_by(.ServiceName) | map({service: .[0].ServiceName, count: length}) | sort_by(-.count)'

# 3. Trace the cross-service flow for one trace
npx @kopai/cli traces get <traceId> --fields ServiceName,SpanName,StatusCode --json
```

Analysis Strategy

  1. Identify which service has the most errors
  2. Check whether failures cascade from an upstream service
  3. Look for a common root cause, such as a shared dependency
  4. Check resource attributes for infrastructure issues
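
One way to check for cascading failures is to look for ERROR spans that have no ERROR children, since those are the deepest failures in the call chain and the likeliest origin. The following is a minimal sketch, not part of the Kopai CLI. It assumes the trace is a JSON array of span objects, that each span carries a `SpanId` field alongside `ParentSpanId`, and that failed spans have `StatusCode` equal to `"ERROR"`. The function name `error_origins` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads a JSON array of spans on stdin and prints the
# ERROR spans that are not the parent of any other ERROR span.
# Assumed span fields: SpanId, ParentSpanId, ServiceName, SpanName, StatusCode.
error_origins() {
  jq -c '
    # Collect the parent IDs of every ERROR span
    ([.[] | select(.StatusCode == "ERROR") | .ParentSpanId] | unique) as $error_parents
    # Keep only ERROR spans
    | map(select(.StatusCode == "ERROR"))
    # Drop spans whose own ID appears as the parent of another ERROR span
    | map(select(.SpanId as $id | $error_parents | any(. == $id) | not))
    | map({service: .ServiceName, span: .SpanName})
  '
}
```

Under those assumptions it could be fed the full output of step 3, for example `npx @kopai/cli traces get <traceId> --json | error_origins`. If several traces point at the same service, that service or a dependency it shares with others is a reasonable place to start.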

Key Fields

| Field | Purpose |
| --- | --- |
| ServiceName | Which service |
| ParentSpanId | Call chain |
| SpanKind | CLIENT/SERVER/INTERNAL |
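
To illustrate how `ParentSpanId` and `ServiceName` combine into a call chain, the sketch below lists the service-to-service hops in one trace. It is an illustration under assumptions rather than a documented Kopai feature: the input is assumed to be a JSON array of spans, `SpanId` is an assumed field name, and `service_edges` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads a JSON array of spans on stdin and prints one
# "caller -> callee [status]" line per span whose parent span belongs to a
# different service. Assumed fields: SpanId, ParentSpanId, ServiceName, StatusCode.
service_edges() {
  jq -r '
    # Lookup table from span ID to the service that emitted it
    (map({key: .SpanId, value: .ServiceName}) | from_entries) as $service_of
    # Ignore root spans, which have no parent
    | map(select(.ParentSpanId != null and .ParentSpanId != ""))
    | map({caller: ($service_of[.ParentSpanId]), callee: .ServiceName, status: .StatusCode})
    # Keep only hops that cross a service boundary
    | map(select(.caller != null and .caller != .callee))
    | .[]
    | "\(.caller) -> \(.callee) [\(.status)]"
  '
}
```

Piping the result through `sort | uniq -c` gives a rough count per hop, which can help spot a dependency that many callers share.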
