Production-grade platform engineering handbook — Kubernetes, Terraform, Flux CD, GitHub Actions, AWS, and more.
Status: Stable
Working examples for tracking the four DORA metrics — Deployment Frequency, Lead Time for Changes, Change Failure Rate, and MTTR — using GitHub Actions, Prometheus Pushgateway, Prometheus recording rules, and Grafana.
| File | Description |
|---|---|
| `deployment-event-step.yaml` | GitHub Actions step: push deploy event to Prometheus Pushgateway |
| `incident-webhook-handler.yaml` | GitHub Actions workflow triggered by PagerDuty/OpsGenie webhook |
| `prometheus-recording-rules.yaml` | All four DORA Prometheus recording rules |
| `grafana-dashboard.json` | Grafana dashboard JSON with four DORA panels and threshold bands |
| `dora-validate.sh` | Domain validator |
| `amp-variant/` | AMP-specific replacements (see below) |
Amazon Managed Service for Prometheus (AMP) has no public Pushgateway endpoint. Use the files in `amp-variant/` alongside the standard files:
| amp-variant file | What it does |
|---|---|
| `pushgateway-helm-values.yaml` | Deploys in-cluster Pushgateway (GitHub Actions still pushes here) |
| `prometheus-agent-values.yaml` | Prometheus Agent: scrapes Pushgateway, remote_writes to AMP via SigV4/IRSA |
| `amp-recording-rules-deploy.sh` | AWS CLI script to create/update DORA rules in AMP (replaces `kubectl apply`) |
| `grafana-amp-datasource.yaml` | Grafana datasource ConfigMap for self-hosted Grafana → AMP (SigV4) |
| `grafana-amg-datasource.json` | Datasource config for Amazon Managed Grafana |
The `deployment-event-step.yaml`, `incident-webhook-handler.yaml`, `prometheus-recording-rules.yaml`, and `grafana-dashboard.json` files are unchanged; only the infrastructure wiring differs.
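As a rough illustration of what a script like `amp-recording-rules-deploy.sh` might do, the sketch below uses the AWS CLI `amp` rule-groups-namespace commands to create a namespace on the first run and overwrite it afterwards. The function name, its arguments, and the describe-then-create flow are assumptions made for illustration, not the contents of the actual script.

```shell
# Hypothetical helper: create the AMP rule-groups namespace if it is missing,
# otherwise overwrite it with the rules file.
# Arguments: <workspace-id> <namespace-name> <rules-file>
deploy_amp_rules() {
  local workspace_id="$1" namespace="$2" rules_file="$3"

  if aws amp describe-rule-groups-namespace \
       --workspace-id "$workspace_id" --name "$namespace" >/dev/null 2>&1; then
    # Namespace exists: replace its contents.
    aws amp put-rule-groups-namespace \
      --workspace-id "$workspace_id" --name "$namespace" \
      --data "fileb://${rules_file}"
  else
    # Namespace missing (or not readable): try to create it.
    aws amp create-rule-groups-namespace \
      --workspace-id "$workspace_id" --name "$namespace" \
      --data "fileb://${rules_file}"
  fi
}
```

A call might look like `deploy_amp_rules "$AMP_WORKSPACE_ID" dora-rules rules.yaml`, where the namespace and file names are placeholders. AMP is expected to take a plain Prometheus rules file (top-level `groups:` key), so a Kubernetes manifest wrapper would likely need to be stripped first.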
Append `deployment-event-step.yaml` to your existing production deploy workflow. Set `PUSHGATEWAY_URL` as a repository secret pointing to your Pushgateway instance.
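To make the push concrete, here is a minimal sketch of the kind of shell a deploy step could run against the Pushgateway HTTP API. The metric name `deployment_timestamp_seconds`, the job name `dora_deployments`, and the `service` grouping label are hypothetical and may not match what `deployment-event-step.yaml` actually emits.

```shell
# Build a Prometheus text-format payload for one deployment event.
# The metric name is an assumption for illustration.
build_deploy_payload() {
  local timestamp="$1"
  printf '# TYPE deployment_timestamp_seconds gauge\n'
  printf 'deployment_timestamp_seconds %s\n' "$timestamp"
}

# POST the payload to the Pushgateway under a job/service grouping key.
# Requires PUSHGATEWAY_URL in the environment (for example from a secret).
push_deploy_event() {
  local service="$1" timestamp="$2"
  build_deploy_payload "$timestamp" |
    curl --fail --silent --show-error --data-binary @- \
      "${PUSHGATEWAY_URL}/metrics/job/dora_deployments/service/${service}"
}
```

In a workflow step this could be invoked as `push_deploy_event my-service "$(date +%s)"`. Grouping-key values are part of the URL path, so this sketch assumes the service name contains no `/` characters.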
Deploy `incident-webhook-handler.yaml` as a GitHub Actions workflow. Configure your PagerDuty or OpsGenie account to send `repository_dispatch` webhook events to the GitHub API.
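For reference, a `repository_dispatch` event is created by a POST to the repository's `/dispatches` endpoint. The sketch below shows the shape of that request; the `incident-resolved` event type and the `incident_id` payload field are hypothetical, and how your incident tool produces the request (directly or through a relay) is outside its scope.

```shell
# Build the JSON body for a repository_dispatch event.
# Assumes the values contain no characters that need JSON escaping.
build_dispatch_body() {
  local event_type="$1" incident_id="$2"
  printf '{"event_type":"%s","client_payload":{"incident_id":"%s"}}' \
    "$event_type" "$incident_id"
}

# Send the event to the GitHub REST API.
# Arguments: <owner/repo> <event-type> <incident-id>; requires GITHUB_TOKEN.
send_dispatch() {
  local repo="$1" event_type="$2" incident_id="$3"
  build_dispatch_body "$event_type" "$incident_id" |
    curl --fail --silent --show-error -X POST \
      -H "Accept: application/vnd.github+json" \
      -H "Authorization: Bearer ${GITHUB_TOKEN}" \
      --data-binary @- \
      "https://api.github.com/repos/${repo}/dispatches"
}
```

A workflow that listens with `on: repository_dispatch` can then read the fields under `github.event.client_payload`.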
Apply the recording rules to your cluster:

    kubectl apply -f examples/dora/prometheus-recording-rules.yaml

Or, if you use the kube-prometheus-stack Helm chart, add the rules to your Prometheus Operator PrometheusRule resource.

Import `grafana-dashboard.json` through the Grafana UI (Dashboards → Import → Upload JSON file), or provision it through your GitOps pipeline by placing it in the Grafana provisioning directory.
Run the domain validator from the repository root:

    bash examples/dora/dora-validate.sh

The validator checks all YAML files with `yq` and the JSON dashboard with `python3`. It exits non-zero if any file fails.
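The checks described above could be approximated as follows. This is a sketch rather than the contents of `dora-validate.sh`: it assumes yq v4 syntax (`yq eval`), uses `python3 -m json.tool` as the JSON parser, and `validate_dir` is a hypothetical helper name.

```shell
# Parse every *.yaml file with yq and every *.json file with python3.
# Prints one FAIL line per broken file and returns non-zero if any failed.
validate_dir() {
  local dir="$1" failed=0 file
  for file in "$dir"/*.yaml "$dir"/*.json; do
    [ -e "$file" ] || continue   # skip glob patterns that matched nothing
    case "$file" in
      *.yaml)
        yq eval '.' "$file" >/dev/null 2>&1 || { echo "FAIL: $file"; failed=1; }
        ;;
      *.json)
        python3 -m json.tool "$file" >/dev/null 2>&1 || { echo "FAIL: $file"; failed=1; }
        ;;
    esac
  done
  return "$failed"
}
```

Calling `validate_dir examples/dora` from the repository root would cover the top-level files; nested directories such as `amp-variant/` would need their own call.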
| Tool | Minimum version | Purpose |
|---|---|---|
| `yq` | v4+ | YAML validation |
| `python3` | 3.6+ | JSON validation |
| Prometheus Pushgateway | any | Receives metric pushes from CI |
| Prometheus | 2.x | Evaluates recording rules |
| Grafana | 9+ | Dashboard rendering (schemaVersion 36) |