Production-grade platform engineering handbook — Kubernetes, Terraform, Flux CD, GitHub Actions, AWS, and more.
Status: Stable
A Flux HelmRelease that goes NotReady and stays there. Platform-skills identifies why it's stuck and what to fix before the next deploy.
| Finding | Severity | Risk |
|---|---|---|
| `version: "*"` (unpinned chart) | Critical | Silent major upgrades; non-reproducible clusters |
| No `dependsOn` | High | HelmRelease deploys before cert-manager CRDs exist → CrashLoopBackOff |
| `interval: 1h` (too long) | Medium | Changes take up to 1 hour to reconcile |
| No `timeout` | Medium | Stuck install blocks other Flux reconciliations |
| No remediation | Medium | Failed install retries forever without rollback |
| `replicaCount: 1` | Medium | Single point of failure during node drain |
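
The findings in the table above can be addressed in a single manifest. The following is a minimal sketch, not the repository's actual file. The release name and namespace come from the commands in this example. The chart name, the `HelmRepository` source, the cert-manager namespace, and the location of `replicaCount` in the values are assumptions made for illustration. The `helm.toolkit.fluxcd.io/v2` API version assumes a recent Flux release; older clusters may need `v2beta2`.

```yaml
# Sketch only: fields marked "assumed" do not appear in the original example.
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: nginx-ingress
  namespace: ingress-system
spec:
  interval: 10m                # reconcile in minutes, not hours
  timeout: 5m                  # fail fast instead of hanging on a stuck install
  dependsOn:
    - name: cert-manager       # assumed: cert-manager is itself a HelmRelease
      namespace: cert-manager  # assumed namespace
  chart:
    spec:
      chart: ingress-nginx     # assumed chart name
      version: "4.10.1"        # pinned; an upgrade is an explicit PR
      sourceRef:
        kind: HelmRepository
        name: ingress-nginx    # assumed source name
        namespace: flux-system # assumed source namespace
  install:
    remediation:
      retries: 3               # stop retrying after three failed installs
  upgrade:
    remediation:
      remediateLastFailure: true  # roll back when an upgrade fails
  values:
    replicaCount: 2            # key path depends on the chart's values schema
```

In helm-controller, `dependsOn` refers to other `HelmRelease` objects, so this ordering only helps if cert-manager is also installed through Flux as a HelmRelease.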
- `4.10.1`: upgrade is an explicit PR decision
- `dependsOn: cert-manager`: ensures CRDs exist before ingress deploys
- `interval: 10m`: changes reconcile in minutes, not hours
- `timeout: 5m`: fail fast, unblock other reconciliations
- `install.remediation.retries: 3` + `upgrade.remediation.remediateLastFailure: true`: automatic rollback on failure
- `replicaCount: 2`: survives node drain without downtime

Evidence collection:

    flux get helmrelease nginx-ingress -n ingress-system
    flux logs --kind HelmRelease --name nginx-ingress --namespace ingress-system
    kubectl describe helmrelease nginx-ingress -n ingress-system

Reconcile and validate:

    flux reconcile helmrelease nginx-ingress -n ingress-system --with-source
    flux get helmrelease nginx-ingress -n ingress-system --watch

Prompt:

> Use $platform-skills to debug this Flux HelmRelease that is stuck NotReady.
> Start with evidence collection, then root cause, fix, validation, and rollback.

.claude-plugin
.github
commands
docs
examples
agent-self-improve
argocd
awesome-docs
aws
cloudfront
functions
lambda-edge
functions
azure
compliance
conventional-commits
datadog
llm-observability
demo
documentation
dora
dynatrace
fluxcd
github-actions
composite-actions
configure-cloud
db-migrate
docker-build-push
k8s-deploy
notify-slack
pr-comment
release-tag
security-scan
setup-env
setup-terraform
terraform-plan
helm
web-service
templates
kubernetes
kyverno
mcp
observability
openshift
pr-review
ownership
runtime-security
supply-chain
terraform
references
scripts
skills
platform-skills
tests