Production-grade platform engineering handbook — Kubernetes, Terraform, Flux CD, GitHub Actions, AWS, and more.
A simple GitOps repository structure using Kustomize overlays for environment differences.
basic-monorepo/
├── clusters/
│   ├── production/
│   │   ├── flux-system/          # Flux bootstrap
│   │   ├── infrastructure.yaml   # Infrastructure Kustomization
│   │   └── apps.yaml             # Apps Kustomization
│   └── staging/
│       ├── flux-system/
│       ├── infrastructure.yaml
│       └── apps.yaml
├── infrastructure/
│   ├── base/                     # Shared infrastructure
│   │   ├── kustomization.yaml
│   │   ├── ingress-nginx/
│   │   └── cert-manager/
│   ├── production/               # Production overrides
│   │   └── kustomization.yaml
│   └── staging/                  # Staging overrides
│       └── kustomization.yaml
└── apps/
    ├── base/                     # Base app definitions
    │   ├── kustomization.yaml
    │   └── my-app/
    ├── production/               # Production config
    │   └── kustomization.yaml
    └── staging/                  # Staging config
        └── kustomization.yaml

git clone https://github.com/YOUR_ORG/YOUR_REPO.git
cd YOUR_REPO

flux bootstrap github \
  --owner=YOUR_ORG \
  --repository=YOUR_REPO \
  --branch=main \
  --path=clusters/production \
  --personal=false

flux bootstrap github \
  --owner=YOUR_ORG \
  --repository=YOUR_REPO \
  --branch=main \
  --path=clusters/staging \
  --personal=false

Each cluster defines what to reconcile:
# clusters/production/infrastructure.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  interval: 10m
  path: ./infrastructure/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  wait: true
  timeout: 5m

Apps depend on infrastructure:
# clusters/production/apps.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  dependsOn:
    - name: infrastructure
  interval: 5m
  path: ./apps/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system

Staging references base with patches:
# infrastructure/staging/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../base
patches:
  - target:
      kind: Deployment
      name: ingress-nginx-controller
    patch: |-
      - op: replace
        path: /spec/replicas
        value: 1  # Staging uses fewer replicas

Production references base without changes or with production-specific patches:
# infrastructure/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../base
patches:
  - target:
      kind: Deployment
      name: ingress-nginx-controller
    patch: |-
      - op: replace
        path: /spec/replicas
        value: 3  # Production uses more replicas

Infrastructure must be ready before apps:
spec:
  dependsOn:
    - name: infrastructure

Block until resources are healthy:

spec:
  wait: true
  timeout: 5m

Keep environment differences small. Most configuration should be in base.
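
One way to check that the overlays stay small is to diff the rendered output of two environments. The sketch below is only an illustration: it assumes both environments have already been rendered to files (for example with `kustomize build apps/staging > staging.yaml`), and the function name `overlay_drift` and the file names are hypothetical.

```shell
# Hypothetical helper: count the lines that differ between two rendered manifests.
# Assumes both files were produced beforehand, for example:
#   kustomize build apps/staging    > staging.yaml
#   kustomize build apps/production > production.yaml
overlay_drift() {
  left="$1"
  right="$2"
  # diff exits non-zero when the files differ, and grep -c exits non-zero
  # when it counts nothing, so neither status is treated as a failure here.
  # Only changed lines (prefixed with < or >) are counted.
  diff "$left" "$right" | grep -c '^[<>]' || true
}
```

Calling `overlay_drift staging.yaml production.yaml` prints a single number. A count that keeps growing over time suggests configuration is moving out of base and into the overlays.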
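mkdir -p apps/base/new-app

cat <<EOF > apps/base/new-app/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: new-app
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: new-app
  template:
    metadata:
      labels:
        app: new-app
    spec:
      containers:
        - name: app
          image: nginx:1.25.0
          ports:
            - containerPort: 80
EOF

# Note: a directory entry such as new-app/ must contain its own kustomization.yaml.
# Note: if apps/base/kustomization.yaml already has a resources: key, add
# "- new-app/" to that existing list instead of appending a second key.
cat <<EOF >> apps/base/kustomization.yaml
resources:
  - new-app/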
EOF

# apps/production/kustomization.yaml
resources:
  - ../base
patches:
  - target:
      kind: Deployment
      name: new-app
    patch: |-
      - op: replace
        path: /spec/replicas
        value: 5  # More replicas in production

git add apps/
git commit -m "Add new-app deployment"
git push

flux reconcile kustomization apps --with-source
kubectl get deployment new-app -w

flux get kustomizations -A

flux logs --kind=Kustomization --since=10m

flux reconcile kustomization apps --with-source

kustomize build apps/production
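
The local render check can be extended to every overlay before pushing. This is a minimal sketch, not part of the original handbook: the function name `check_overlays` and the `KUSTOMIZE_BUILD` variable are hypothetical, and the variable exists only so the render command can be swapped out. By default it runs `kustomize build`, which must be installed.

```shell
# Hypothetical pre-push check: render each overlay and stop at the first failure.
# KUSTOMIZE_BUILD is an assumed override for the render command;
# it defaults to "kustomize build".
check_overlays() {
  for dir in "$@"; do
    # The render command is left unquoted on purpose so that
    # "kustomize build" splits into a command and its argument.
    if ! ${KUSTOMIZE_BUILD:-kustomize build} "$dir" >/dev/null; then
      echo "FAILED: $dir"
      return 1
    fi
    echo "ok: $dir"
  done
}
```

A typical call would be `check_overlays apps/staging apps/production infrastructure/staging infrastructure/production`, run from the repository root.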