DAVIS problem analysis including root cause identification, impact assessment, and correlation with other telemetry. Use when querying or investigating detected problems. Trigger: "active problems", "root cause analysis", "problem impact", "affected users", "list problems", "P-12345 details", "recurring problems", "problem history", "problem trending", "blast radius", "which entity caused the problem", "problems affecting Kubernetes", "problems by service". Do NOT use for explaining existing queries, product documentation questions, generic log searching, distributed tracing, or host-level resource monitoring.
Analyze Dynatrace AI-detected problems including root cause identification, impact assessment, and correlation with logs and metrics.
Dynatrace automatically detects anomalies, performance degradations, and failures across your environment, creating problems that aggregate related alert, warning and info-level events and provide root cause and impact insights.
Problems are automatically detected issues affecting software and infrastructure health and resilience.
The event.kind field (stable, permission) identifies the high-level event type:
| event.kind value | Description |
|---|---|
| DAVIS_EVENT | Davis-detected infrastructure/application events |
| BIZ_EVENT | Business events (ingested via API or captured from spans) |
| RUM_EVENT | Real User Monitoring events |
| AUDIT_EVENT | Administrative/security audit events |
event.provider (stable, permission) identifies the event source.
Common event.category values:
| Category | Description | Example |
|---|---|---|
| AVAILABILITY | Infrastructure or service unavailable | Web service returns no data, synthetic test actively fails, database connection lost |
| ERROR | Increased error rates beyond baseline | API error rate jumped from 0.1% to 15% |
| SLOWDOWN | Performance degradation | Response time increased from 200ms to 5000ms |
| RESOURCE | Resource saturation | Container memory at 95%, causing OOM kills |
| CUSTOM | Custom anomaly detections | Business KPI (orders/minute) dropped below threshold |
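
As a hedged illustration of how the category values in the table can be used in a filter, the sketch below narrows active problems to a single category. The value `SLOWDOWN` comes from the table; the 24-hour window and the result limit are arbitrary assumptions chosen for the example.

```dql
// Sketch: active SLOWDOWN problems from the last 24 hours.
// The time window and limit are illustrative assumptions.
fetch dt.davis.problems, from:now() - 24h
| filter not(dt.davis.is_duplicate)
| filter event.status == "ACTIVE" and event.category == "SLOWDOWN"
| fields event.start, display_id, event.name, event.category
| sort event.start desc
| limit 20
```

Swapping `SLOWDOWN` for another value from the table should give the equivalent view for that category.
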
Detection → ACTIVE → Under Investigation → CLOSED

| ❌ WRONG | ✅ CORRECT | Description |
|---|---|---|
| title | event.name | Problem title/description |
| status | event.status | Problem lifecycle status |
| severity | event.category | Problem type/category |
| start | event.start | Problem start time |

// ✅ CORRECT: Use these status values
fetch dt.davis.problems
| filter event.status == "ACTIVE" // Currently occurring problems
// or event.status == "CLOSED" // Resolved problems
// ❌ INCORRECT: event.status == "OPEN" does not exist!
| limit 1

fetch dt.davis.problems, from:now() - 1h
| filter not(dt.davis.is_duplicate)
| fields
event.start, // Problem start timestamp
event.end, // Problem end timestamp (if closed)
display_id, // Human-readable problem ID (P-XXXXX)
event.name, // Problem title
event.description, // Detailed description
event.category, // Problem type
event.status, // ACTIVE or CLOSED
dt.smartscape_source.id, // The smartscape ID for the affected resource
dt.davis.affected_users_count, // Number of affected users
smartscape.affected_entity.ids, // Array of affected entity IDs
dt.smartscape.service, // Affected services (may be array)
dt.davis.root_cause_entity, // Entity identified as root cause
root_cause_entity_id, // Root cause entity ID
root_cause_entity_name, // Human-readable root cause name
dt.davis.is_duplicate, // Whether duplicate detection
dt.davis.is_rootcause // Root cause vs. symptom
| limit 10

Always start problem queries with this foundation:

fetch dt.davis.problems, from:now() - 2h
| filter not(dt.davis.is_duplicate) and event.status == "ACTIVE"
| fields event.start, display_id, event.name, event.category
| sort event.start desc
| limit 20

Key components:

- fetch dt.davis.problems - The problems data source
- not(dt.davis.is_duplicate) - Filter out duplicate detections
- event.status == "ACTIVE" - Show only active problems

fetch dt.davis.problems
| filter not(dt.davis.is_duplicate) and event.status == "ACTIVE"
| summarize problem_count = count(), by: {event.category}
| sort problem_count desc

fetch dt.davis.problems
| filter not(dt.davis.is_duplicate) and event.status == "ACTIVE"
| filter dt.davis.affected_users_count > 100
| fields event.start, display_id, event.name, dt.davis.affected_users_count, event.category
| sort dt.davis.affected_users_count desc

fetch dt.davis.problems
| filter not(dt.davis.is_duplicate) and event.status == "ACTIVE"
| filter arraySize(affected_entity_ids) > 5
| fields event.start, display_id, event.name, affected_entity_ids, event.category, impacted_entity_count = arraySize(affected_entity_ids)
| sort impacted_entity_count desc

fetch dt.davis.problems
| filter display_id == "P-XXXXXXXXXX"
| fields event.start, event.end, event.name, event.description, affected_entity_ids, dt.davis.affected_users_count, root_cause_entity_id, root_cause_entity_name

fetch dt.davis.problems, from:now() - 7d
| filter not(dt.davis.is_duplicate)
| filter in(dt.smartscape.service, toSmartscapeId("SERVICE-XXXXXXXXX"))
| summarize problems = count(), by: {event.category, event.status}

fetch dt.davis.problems, from:now() - 24h
| filter not(dt.davis.is_duplicate) and event.status == "ACTIVE"
| fields
display_id,
event.name,
event.description,
root_cause_entity_id,
root_cause_entity_name,
smartscape.affected_entity.ids

Identify which entities most frequently cause problems:

fetch dt.davis.problems, from:now() - 7d
| filter not(dt.davis.is_duplicate)
| filter isNotNull(root_cause_entity_id)
| summarize problem_count = count(), by:{root_cause_entity_name}
| sort problem_count desc
| limit 20

fetch dt.davis.problems, from:now() - 24h
| filter not(dt.davis.is_duplicate) and event.status == "ACTIVE"
| filter matchesPhrase(arrayToString(smartscape.affected_entity.types, delimiter:","), "AWS_")

fetch dt.davis.problems, from:now() - 30m
| filter not(dt.davis.is_duplicate) and event.status == "ACTIVE"
| filter matchesPhrase(root_cause_entity_id, "HOST-")
| filter isNotNull(dt.smartscape.service)
| fields display_id, event.name, root_cause_entity_name, dt.smartscape.service

Calculate entity impact per root cause:

fetch dt.davis.problems, from:now() - 7d
| filter not(dt.davis.is_duplicate)
| filter isNotNull(root_cause_entity_id)
| fieldsAdd affected_count = arraySize(smartscape.affected_entity.ids)
| summarize
avg_affected = avg(affected_count),
max_affected = max(affected_count),
problem_count = count(),
by:{root_cause_entity_name}
| sort avg_affected desc

Identify entities repeatedly causing problems:

fetch dt.davis.problems, from:now() - 24h
| filter not(dt.davis.is_duplicate)
| filter isNotNull(root_cause_entity_id)
| summarize
problem_count = count(),
first_occurrence = min(event.start),
last_occurrence = max(event.start),
by:{root_cause_entity_id, root_cause_entity_name}
| filter problem_count > 3
| sort problem_count desc

These are different questions; pick the right approach:

- Cause categories: group by event.category (SLOWDOWN, ERROR, RESOURCE, AVAILABILITY, CUSTOM). Explain what triggers each category.
- Root cause entities: group by root_cause_entity_name. Lists specific services, hosts, or apps.

Cause category breakdown (use when asked about common causes, patterns, or types):

fetch dt.davis.problems, from:now() - 30d
| filter not(dt.davis.is_duplicate)
| summarize problem_count = count(), by: {event.category}
| sort problem_count desc

Then, for each category, explain what triggers it using the Problem Categories table and cite specific entities from the tenant data as examples.

Track problem trends over time, identify recurring issues, and analyze resolution performance.
Primary Files:

- references/problem-trending.md - Timeseries analysis and pattern detection

Common Use Cases:

- makeTimeseries

Key Techniques:

- makeTimeseries vs bin(): Choose the right approach for lifecycle spans vs discrete events
- coalesce(event.end, now()) for active problems

See references/problem-trending.md for complete query patterns and best practices.

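
The difference between lifecycle spans and discrete events can be sketched with two queries. These are illustrative sketches, not copies of the patterns in references/problem-trending.md. The 7-day window, the 1-hour and 1-day intervals, and the output names `open_problems`, `problems_started`, and `day` are assumptions, and the `spread` parameter should be checked against the makeTimeseries documentation for your environment.

```dql
// Lifecycle spans: count problems that were open during each interval.
// spread distributes each problem across its start-to-end timeframe;
// coalesce(event.end, now()) treats still-active problems as open until now.
fetch dt.davis.problems, from:now() - 7d
| filter not(dt.davis.is_duplicate)
| makeTimeseries open_problems = count(),
    spread: timeframe(from: event.start, to: coalesce(event.end, now())),
    interval: 1h
```

```dql
// Discrete events: count problems by the day on which they started.
fetch dt.davis.problems, from:now() - 7d
| filter not(dt.davis.is_duplicate)
| summarize problems_started = count(), by: {day = bin(event.start, 1d)}
| sort day asc
```

The first form answers "how many problems were open at a given time"; the second answers "how many problems began in each period".
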
Use affected_entity_ids or dt.smartscape_source.id to find problems related to Kubernetes:
fetch dt.davis.problems, from:now() - 7d
| filter not(dt.davis.is_duplicate)
| filter matchesPhrase(dt.smartscape_source.id, "KUBERNETES_CLUSTER")
OR matchesPhrase(dt.smartscape_source.id, "K8S_")
| fields event.start, display_id, event.name, event.category, event.status,
dt.smartscape_source.id, affected_entity_ids
| sort event.start desc

Alternative: expand affected entities and filter for K8s entity types:

fetch dt.davis.problems, from:now() - 7d
| filter not(dt.davis.is_duplicate)
| expand entity_id = affected_entity_ids
| filter matchesPhrase(entity_id, "KUBERNETES_CLUSTER")
OR matchesPhrase(entity_id, "K8S_")
| fields event.start, display_id, event.name, event.category, entity_id
| sort event.start desc

List all problems from the last 24 hours (common request):

fetch dt.davis.problems, from:now() - 24h
| filter not(dt.davis.is_duplicate)
| fields event.start, event.end, display_id, event.name, event.category, event.status
| sort event.start desc

When summarizing problem causes, categories, or patterns, provide a comprehensive breakdown across all standard categories present in the data: AVAILABILITY, ERROR, SLOWDOWN, RESOURCE, and CUSTOM. For each category found:
Do not stop after the first two categories — users expect the full picture. Reference the Problem Categories table above for trigger descriptions.
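
One possible way to gather the numbers for such a breakdown in a single query is sketched below. The 7-day window and the output names `total`, `active`, and `closed` are assumptions made for illustration.

```dql
// Sketch: one row per category with total, active, and closed counts.
// Window and column names are illustrative assumptions.
fetch dt.davis.problems, from:now() - 7d
| filter not(dt.davis.is_duplicate)
| summarize
    total = count(),
    active = countIf(event.status == "ACTIVE"),
    closed = countIf(event.status == "CLOSED"),
    by: {event.category}
| sort total desc
```

Categories with no problems in the selected window do not appear as rows, so a missing category means no problems of that type were found rather than a query error.
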
When presenting query results:

For a few entities, get-entity-name calls are fine. For larger sets, use the query-problems tool which returns names directly, or include root_cause_entity_name / entityName() in the DQL query to resolve names inline. Avoid calling get-entity-name in a loop for 10+ entities; this can exhaust the tool call limit and return no answer at all.

- Filter with not(dt.davis.is_duplicate) to avoid counting the same problem multiple times
- Use "ACTIVE" or "CLOSED", never "OPEN"
- Apply not(dt.davis.is_duplicate) immediately after fetch
- Use | limit 1 to verify field names exist
- Include display_id for reference
- Filter with isNotNull(root_cause_entity_id) when required
- dt.davis.event_ids

// ✅ GOOD - Specific time range
fetch dt.davis.problems, from:now() - 4h

// ❌ BAD - Scans all historical data
fetch dt.davis.problems

| Problem | Cause | Solution |
|---|---|---|
| No problems returned | Using event.status == "OPEN" | Use "ACTIVE" or "CLOSED"; "OPEN" does not exist |
| Duplicate problems in results | Missing deduplication filter | Add filter not(dt.davis.is_duplicate) immediately after fetch |
| Wrong field name (title, status, severity) | SQL-like naming | Use event.name, event.status, event.category; see field name table above |
| root_cause_entity_id is null | Not all problems have identified root causes | Add filter isNotNull(root_cause_entity_id) when querying root causes |
| Query scans too much data / times out | Missing time range | Always specify from:now() - <duration> on the fetch command |
| affected_entity_ids is empty array | Problem has no mapped affected entities | Check dt.smartscape.service or dt.smartscape_source.id as alternatives |
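
As a hedged sketch of the last two rows, the query below lists problems that have no identified root cause and shows the alternative entity fields next to them. The 24-hour window and the limit are assumptions made for illustration.

```dql
// Sketch: problems without an identified root cause,
// with the alternative entity fields shown for follow-up.
fetch dt.davis.problems, from:now() - 24h
| filter not(dt.davis.is_duplicate)
| filter isNull(root_cause_entity_id)
| fields display_id, event.name, event.category, dt.smartscape_source.id, dt.smartscape.service
| limit 20
```

If this returns many rows, root cause queries filtered with isNotNull(root_cause_entity_id) will cover only part of the problem set, which is worth stating when reporting results.
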