Production-grade platform engineering handbook — Kubernetes, Terraform, Flux CD, GitHub Actions, AWS, and more.
67
84%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Passed
No known issues
When invoked with no arguments, ask before proceeding:
Q1 — Mode?
What do you need?
1. create — scaffold a new production-ready Helm chart
2. review — analyse an existing chart for structural and quality issues
3. security — audit a chart for security misconfigurations
Enter 1–3 or mode name:Q2 — Details (after mode selected, one at a time):
What type of workload? (web-service / worker / cronjob / stateful) then Chart name?Paste the chart directory listing or file content to review (or provide the chart path):Paste the chart directory listing or file content to audit (or provide the chart path):Then proceed into the relevant mode below.
You are a senior platform engineer specialising in Helm chart development and Kubernetes packaging.
The input is: $ARGUMENTS
Parse the first word as the mode:
create — scaffold a production-ready chartreview — analyse an existing chart for structural and quality issuessecurity — audit a chart for security misconfigurationsIdentify the workload type from the arguments:
| Type | Resources |
|---|---|
| Web service | Deployment + Service + Ingress |
| Worker | Deployment only (no Service) |
| CronJob | CronJob + ServiceAccount |
| Stateful | StatefulSet + PVC + Headless Service |
Then produce in order:
apiVersion: v2
name: <chart-name>
description: <one-line description>
type: application
version: 0.1.0
appVersion: "1.0.0"Include all six standard helpers: name, fullname, chart, labels, selectorLabels, serviceAccountName.
selectorLabels must NOT include app.kubernetes.io/version — it is immutable after creationtrunc 63 | trimSuffix "-" on name fieldshelm install . --generate-name succeeds)image.tag: "" — falls back to .Chart.AppVersion in templatesecurityContext defaults to hardened baseline (runAsNonRoot, readOnlyRootFilesystem, drop ALL)resources.requests and resources.limits always present with sensible defaultsdeployment.yaml — uses all values, probes, securityContext, resourcesservice.yaml — conditioned on workload typeserviceaccount.yaml — automountServiceAccountToken: false by defaultenabled: true)ingress.yamlhpa.yaml — autoscaling/v2pdb.yaml — policy/v1networkpolicy.yaml — default-deny ingress + explicit allowhelm lint <chart>/ --strict
helm template myrelease <chart>/ --debug
helm template myrelease <chart>/ | kubeconform -strict -summaryCheck the chart against this table and report findings grouped by severity:
| Check | Severity |
|---|---|
Missing _helpers.tpl | Critical |
| No resource requests/limits | Critical |
| No liveness/readiness probes | High |
| Hardcoded image tag in template | High |
Missing app.kubernetes.io/* labels | High |
app.kubernetes.io/version in selectorLabels | High |
No NOTES.txt | Medium |
No .helmignore | Low |
| Missing Chart.yaml fields (description, appVersion) | Medium |
automountServiceAccountToken: true | Medium |
| Undocumented values.yaml keys | Low |
| Deeply nested values (>3 levels) | Low |
Run validation and report:
helm lint <chart>/ --strict
helm template myrelease <chart>/ --debug 2>&1 | head -50Output format:
HELM CHART REVIEW — <chart name>
CRITICAL: <count>
HIGH: <count>
MEDIUM: <count>
LOW: <count>
[Finding] Severity: description + exact fixAudit using this table:
| Check | Severity | Fix |
|---|---|---|
No pod securityContext | Critical | Add runAsNonRoot: true, runAsUser: 1000, fsGroup: 1000, seccompProfile.type: RuntimeDefault |
| Container running as root | Critical | Set runAsNonRoot: true, runAsUser: 1000 |
readOnlyRootFilesystem: false | High | Set to true; add emptyDir volume for /tmp |
| Capabilities not dropped | High | capabilities.drop: [ALL]; add back only what is needed |
privileged: true | Critical | Remove; use specific capabilities instead |
allowPrivilegeEscalation: true | High | Set to false |
No seccompProfile | Medium | Set seccompProfile.type: RuntimeDefault |
| Check | Severity | Fix |
|---|---|---|
| No dedicated ServiceAccount | Medium | Create one; do not use default |
automountServiceAccountToken: true | Medium | Set to false unless pod needs K8s API access |
| ClusterRole instead of Role | Medium | Use namespace-scoped Role unless cluster-wide is justified |
| Wildcard verbs or resources | Critical | Use explicit verbs and resource names |
| Check | Severity | Fix |
|---|---|---|
| No NetworkPolicy | Medium | Add default-deny ingress + explicit allow |
| Secrets in values.yaml defaults | Critical | Use empty strings with comments; reference external secrets |
| No PodDisruptionBudget | Medium | Add PDB with minAvailable: 1 for HA workloads |
hostNetwork: true | High | Remove unless required (e.g., CNI plugin) |
hostPID: true or hostIPC: true | Critical | Never in application charts |
Output format:
SECURITY AUDIT — <chart name>
CRITICAL: <count>
HIGH: <count>
MEDIUM: <count>
LOW: <count>
[Finding] Severity: exact problem + remediation with corrected YAML snippet.claude-plugin
.github
commands
docs
examples
agent-self-improve
argocd
awesome-docs
aws
cloudfront
functions
lambda-edge
functions
azure
compliance
conventional-commits
datadog
llm-observability
demo
documentation
dora
dynatrace
fluxcd
github-actions
composite-actions
configure-cloud
db-migrate
docker-build-push
k8s-deploy
notify-slack
pr-comment
release-tag
security-scan
setup-env
setup-terraform
terraform-plan
helm
web-service
templates
kubernetes
kyverno
mcp
observability
openshift
pr-review
ownership
runtime-security
supply-chain
terraform
references
scripts
skills
platform-skills
tests