Production-grade platform engineering handbook — Kubernetes, Terraform, Flux CD, GitHub Actions, AWS, and more.
67
84%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Passed
No known issues
Copy these prompts into Claude, Codex, Cursor, or GitHub Copilot. In Claude and Codex, keep $platform-skills when the integration is installed. In Cursor and Copilot, the same prompts work as natural language after you install the project rules or Copilot instructions.
Use $platform-skills to review this change for production readiness. Focus on ownership, blast radius, validation, rollback, and security defaults.Act as a senior platform engineer. Review the files I changed and return findings ordered by severity, with exact file references, validation steps, and rollback notes.Use platform-skills patterns to turn this rough infrastructure idea into a safe implementation plan with assumptions, risks, validation, and rollback.Use $platform-skills to review this Kubernetes workload for production readiness: securityContext, resources, probes, lifecycle, HPA, PDB, service account, RBAC, and NetworkPolicy.Generate a production-ready Kubernetes Deployment, Service, HPA, PDB, and NetworkPolicy for this app. Include validation commands and rollback steps.This pod is CrashLoopBackOff. Walk me through evidence collection, likely root causes, safe fixes, validation, and rollback.Use $platform-skills to review this Terraform plan for replacement risk, IAM scope, state impact, provider constraints, cost, compliance, and rollback.Generate a Terraform module skeleton with versions.tf, variables.tf validations, outputs.tf, README.md, examples, and validation commands.Review this IAM policy for least privilege. Flag wildcard actions, wildcard resources, missing conditions, and safer alternatives.Use $platform-skills to debug this Flux Kustomization or HelmRelease that is stuck NotReady. Start with evidence, then root cause, fix, validation, and rollback.Review this Flux repository structure for source ownership, environment overlays, image automation boundaries, secret handling, and promotion safety.Review this Argo CD Application for project isolation, sync policy, prune behavior, namespace creation, drift risk, and rollback path.Use $platform-skills to review this Helm chart for values design, immutable selectors, securityContext, probes, resources, schema validation, and GitOps compatibility.Generate a production-ready Helm chart for this service with values.schema.json, NetworkPolicy, HPA, PDB, probes, resources, and test hooks.Use $platform-skills to harden this GitHub Actions workflow. Check permissions, OIDC, SHA-pinned actions, pull_request_target risk, caching, secrets, and artifact integrity.Generate a GitHub Actions pipeline for Terraform with fmt, validate, tflint, checkov, plan, approval gates, and least-privilege OIDC.Use $platform-skills to review this AWS design for IAM least privilege, network boundaries, encryption, logging, cost, tagging, and rollback.Review this EKS setup for IRSA, node group safety, cluster access, logging, network policy readiness, and upgrade risk.Review this Azure AKS setup for workload identity, RBAC, network policy, private cluster exposure, logging, and policy controls.Use $platform-skills to review this change for SOC 2 control impact. Map findings to access control, encryption, logging, monitoring, backup, and change management.Review this supply chain pipeline for Cosign signing, SBOM generation, provenance, image scanning, dependency pinning, and admission enforcement.Review these Kyverno or OPA policies for audit-first rollout, false positive risk, test coverage, exception handling, and promotion to enforcement.Use $platform-skills to design observability for this service: logs, metrics, traces, SLOs, dashboards, alerts, runbooks, and validation.We have an incident. Build a troubleshooting plan with symptom, evidence, hypotheses, diagnosis commands, safe fixes, validation, prevention, and rollback.Review these alerts for actionability, ownership, severity, burn-rate signal, noise risk, and runbook links.Use $platform-skills to review this KEDA ScaledObject or ScaledJob for trigger auth, min/max replicas, cooldown, fallback, scale-to-zero risk, and validation.Design an event-driven autoscaling setup for this queue or metric. Include scaler choice, authentication, failure behavior, validation, and rollback.Use $platform-skills to review this PR across six dimensions: cost, drift, ownership, compliance, upgrade risk, and rollback feasibility.Summarize this PR for a platform maintainer. Call out risky files, missing validation, blast radius, and what must be fixed before merge.Triage the review comments on this PR. Classify valid fixes, questions, duplicates, and non-actionable comments. Apply safe fixes only.Create a rollout plan for adopting platform-skills across 50 repositories using Copilot instructions, Cursor rules, Codex skills, and Claude. Include phases, ownership, validation, and rollback.Create a one-page internal announcement for platform-skills. Explain who should use it, how to install it, first prompts to try, and how to report gaps.Use $platform-skills to audit all IAM roles and policies in this Terraform module.
Flag wildcard actions, wildcard resources, missing conditions, overly broad assume-role trust policies, and unused permissions.Review this supply chain configuration for software composition risks: pinned action SHAs, SBOM generation, image signing, dependency provenance, and artifact attestation.Generate an OPA/Rego policy that enforces these security controls on all Kubernetes Deployments:
non-root containers, read-only root filesystem, dropped capabilities, no hostPID or hostNetwork, and resource limits required.Review this Falco rule set for coverage gaps, false-positive risk, and missing detections for privilege escalation, lateral movement, and data exfiltration in a Kubernetes environment.Use $platform-skills to produce a threat model for this architecture diagram. Identify trust boundaries, attack vectors, blast radius, and prioritized mitigations.Use $platform-skills to walk me through this incident symptom: [paste error, alert, or log]. Start with evidence collection, then root cause hypothesis, safe fix options, validation steps, and rollback.Generate a runbook for this failure mode: [describe the failure]. Include detection signals, triage steps, escalation path, remediation commands, validation, and post-incident actions.Review these SLO definitions for correctness: error budget calculation, burn rate alert thresholds, window alignment, and whether the SLIs actually measure user experience.Use $platform-skills to design a chaos experiment for this service. Include steady-state hypothesis, failure injection method, blast radius, abort conditions, and success criteria.Generate capacity planning estimates for this workload at 2x and 5x current traffic. Include pod count, node count, RDS instance size, and NAT gateway throughput.I'm onboarding to this platform. Walk me through what I need to know: cluster access, namespaces, secrets management, deploy process, observability, and how to get help.Use $platform-skills to review my PR before I ask for human review. Check for: missing tests, unsafe Kubernetes defaults, hardcoded config, secrets in code, and missing rollback plan.Generate a deploy checklist for this service going to production for the first time.
Include: health check endpoints, runbook location, alerting coverage, rollback procedure, and feature flag state.This deploy just went wrong. Walk me through a safe rollback: how to detect the blast radius, the rollback commands, how to validate it worked, and what to document in the incident channel..claude-plugin
.github
commands
docs
examples
agent-self-improve
argocd
awesome-docs
aws
cloudfront
functions
lambda-edge
functions
azure
compliance
conventional-commits
datadog
llm-observability
demo
documentation
dora
dynatrace
fluxcd
github-actions
composite-actions
configure-cloud
db-migrate
docker-build-push
k8s-deploy
notify-slack
pr-comment
release-tag
security-scan
setup-env
setup-terraform
terraform-plan
helm
web-service
templates
kubernetes
kyverno
mcp
observability
openshift
pr-review
ownership
runtime-security
supply-chain
terraform
references
scripts
skills
platform-skills
tests