Volta, a SaaS payments company, uses a Discord channel called #incidents to coordinate during outages. An on-call engineer has just been paged and is trying to get up to speed on an ongoing incident. The last 30 minutes of conversation have been exported for you.
Read through the incident thread and produce a structured triage document that the incoming engineer can use to take over. Also draft a short status message that can be posted into the #incidents channel to update the broader team and any stakeholders who are watching.
Save the triage document to triage.md and the channel status update to status-update.md.
The following file is provided as input. Extract it before beginning.
=============== FILE: inputs/incident-thread.txt ===============
#incidents — 2024-03-20
[14:01] pagerduty-bot: 🔴 ALERT: checkout-service error rate > 5% for 3m — P1 triggered
[14:02] felix: on it, pulling logs now
[14:04] felix: ok seeing a spike of 500s from checkout-service starting around 13:58. error message is "upstream connect error or disconnect/reset before headers" — looks like a connection issue with payment-gateway-service
[14:06] yuna: I deployed payment-gateway-service v3.8.1 at 13:55 — that timing lines up
[14:07] felix: @yuna can you check if the new version changed the keep-alive timeout config? the errors look like premature connection closes
[14:08] yuna: checking... yes, I see it — keep-alive timeout was accidentally lowered from 30s to 3s in the v3.8.1 config
[14:09] felix: that's almost certainly it. how quickly can you roll back?
[14:10] yuna: I'm initiating rollback to v3.8.0 now
[14:12] yuna: rollback done, monitoring
[14:13] felix: error rate dropping — now at 1.2%
[14:15] felix: back to baseline, looking normal. I'll keep watching for 10 more mins
[14:17] raj: how many users were affected?
[14:18] felix: not sure yet — we don't have transaction-level impact numbers. I'd estimate it based on checkout volume but I haven't pulled that yet
[14:19] yuna: also not sure if any payments actually failed vs just retried successfully on the client side
[14:20] felix: yeah we need to check that. @data-team can someone pull failed vs retried checkout counts for 13:55–14:13?
[14:21] raj: should we file a ticket for the config guard so this can't happen in a future deploy?
[14:22] felix: yes definitely. I'll create a follow-up ticket after things fully stabilize
[14:23] felix: error rate stable at 0.1%, looks resolved. still monitoring
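The follow-up ticket raj proposes at 14:21 is a deploy-time config guard that would have rejected the v3.8.1 change (keep-alive timeout lowered from 30s to 3s) before it shipped. A minimal sketch of such a guard in Python — the field name `keep_alive_timeout_seconds` and the 30-second floor are illustrative assumptions, not Volta's actual config schema:

```python
# Deploy-time guard that rejects configs with a dangerously low keep-alive
# timeout. The field name and the 30s floor are assumptions for illustration;
# the floor matches the known-good v3.8.0 value from the incident thread.

MIN_KEEP_ALIVE_SECONDS = 30  # assumed safe floor (the pre-incident value)

def check_keep_alive(config: dict) -> list[str]:
    """Return a list of config errors; an empty list means the config passes."""
    errors = []
    timeout = config.get("keep_alive_timeout_seconds")
    if timeout is None:
        errors.append("keep_alive_timeout_seconds is missing")
    elif timeout < MIN_KEEP_ALIVE_SECONDS:
        errors.append(
            f"keep_alive_timeout_seconds={timeout} is below the "
            f"{MIN_KEEP_ALIVE_SECONDS}s floor"
        )
    return errors

# The bad v3.8.1 value (3s) fails the guard; the v3.8.0 value (30s) passes.
assert check_keep_alive({"keep_alive_timeout_seconds": 3}) != []
assert check_keep_alive({"keep_alive_timeout_seconds": 30}) == []
```

In practice a check like this would run in CI against the rendered config before rollout, turning this class of regression into a failed build instead of a P1.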