DevOps essentials — Dockerfile best practices, CI/CD patterns, deployment configuration, and container security
Any time you create a Dockerfile, docker-compose file, or CI/CD pipeline, or prepare an application for deployment, you MUST follow these DevOps best practices. There are no exceptions.
This is not optional. This is not something to add later. These practices are part of building a production application, just like error handling and testing.
Single-stage builds ship compilers, dev dependencies, and build tools into production. This creates bloated, insecure images.
WRONG — single-stage build with dev dependencies in production:
FROM node:20
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build
CMD ["node", "dist/index.js"]

RIGHT — multi-stage build, minimal production image:
# Build stage
FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Production stage
FROM node:20-alpine AS production
WORKDIR /app
COPY package*.json ./
# Install production dependencies only — copying node_modules from the
# build stage would ship devDependencies into the production image
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
ENV NODE_ENV=production
CMD ["node", "dist/index.js"]

Python example — multi-stage:
# Build stage
FROM python:3.12-slim AS build
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt
COPY . .
# Production stage
FROM python:3.12-slim AS production
WORKDIR /app
COPY --from=build /install /usr/local
COPY --from=build /app .
CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

Containers run as root by default. A container escape with root access compromises the host. Always create and switch to a non-root user.
WRONG — running as root (the default):
FROM node:20-alpine
WORKDIR /app
COPY . .
CMD ["node", "index.js"]

RIGHT — non-root user:
# (build stage as in the multi-stage example above)
FROM node:20-alpine AS production
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
# Create non-root user and set ownership
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
RUN chown -R appuser:appgroup /app
USER appuser
CMD ["node", "dist/index.js"]

Python example — non-root:
# (build stage as in the multi-stage Python example above)
FROM python:3.12-slim AS production
WORKDIR /app
COPY --from=build /install /usr/local
COPY --from=build /app .
RUN groupadd -r appgroup && useradd -r -g appgroup appuser
RUN chown -R appuser:appgroup /app
USER appuser
CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

This is the most commonly forgotten step. Without .dockerignore, Docker copies everything including node_modules, .git, .env files, and test artifacts into the build context. This bloats the image, leaks secrets, and breaks builds.
CRITICAL: Every time you create a Dockerfile, you MUST also create a .dockerignore file in the same directory (or project root). This is not optional. A Dockerfile without a .dockerignore is incomplete.
Always create a .dockerignore file with at minimum:
node_modules
.git
.gitignore
.env
.env.*
*.md
.vscode
.idea
dist
coverage
.nyc_output
__pycache__
*.pyc
.pytest_cache
.venv
venv

Docker caches each layer. When a layer changes, all subsequent layers are invalidated. Copy dependency files first, install, then copy source code. This way source code changes don't re-trigger dependency installation.
WRONG — copies everything first, cache busted on every source change:
FROM node:20-alpine
WORKDIR /app
COPY . .
RUN npm install
CMD ["node", "index.js"]

RIGHT — dependency files first, then install, then source:
FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

Python example — cache-efficient:
FROM python:3.12-slim AS build
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

Without a HEALTHCHECK, Docker and orchestrators cannot determine if your application is actually serving traffic or is stuck in a broken state.
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

Or with curl:
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD curl -f http://localhost:8000/health || exit 1

Never use :latest — it causes unpredictable builds and breaks reproducibility.
WRONG:
FROM node:latest
FROM python:latest

RIGHT:
FROM node:20-alpine
FROM python:3.12-slim

Every CI/CD pipeline must have distinct stages. Tests MUST pass before deployment. Never combine build and deploy into a single step.
GitHub Actions — RIGHT:
name: CI/CD

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'
      - run: npm ci
      - run: npm run lint
      - run: npm run type-check

  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'
      - run: npm ci
      - run: npm test

  deploy:
    needs: [lint, test]
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy
        run: echo "Deploy step"
        env:
          DEPLOY_TOKEN: ${{ secrets.DEPLOY_TOKEN }}

Key rules:
needs: [lint, test] enforces stage ordering: deploy cannot start until lint and test both pass
if: github.ref == 'refs/heads/main' restricts deployment to the main branch

WRONG — secrets hardcoded or in build args:
env:
  DATABASE_URL: "postgresql://user:password@db.example.com/prod"
  API_KEY: "sk-abc123"

RIGHT — use repository secrets and environment variables:
env:
  DATABASE_URL: ${{ secrets.DATABASE_URL }}
  API_KEY: ${{ secrets.API_KEY }}

All configuration must come from environment variables. Never hardcode connection strings, API keys, ports, or feature flags.
WRONG — hardcoded config:
const db = new Pool({
  host: 'db.production.example.com',
  port: 5432,
  password: 'supersecret',
});
const port = 3000;

RIGHT — environment variables with validation:
const requiredEnv = (name: string): string => {
  const value = process.env[name];
  if (!value) throw new Error(`Missing required environment variable: ${name}`);
  return value;
};

const config = {
  port: parseInt(process.env.PORT || '3000', 10),
  databaseUrl: requiredEnv('DATABASE_URL'),
  nodeEnv: process.env.NODE_ENV || 'development',
};

Python example:
import os

def required_env(name: str) -> str:
    value = os.environ.get(name)
    if not value:
        raise ValueError(f"Missing required environment variable: {name}")
    return value

DATABASE_URL = required_env("DATABASE_URL")
PORT = int(os.environ.get("PORT", "8000"))

When a container is stopped or a new version is deployed, the orchestrator sends SIGTERM. Your application must handle it to finish in-flight requests and close connections cleanly. Without this, requests are dropped during deployments.
Node.js — graceful shutdown:
const server = app.listen(config.port, () => {
  logger.info({ port: config.port }, 'server_started');
});

function gracefulShutdown(signal: string) {
  logger.info({ signal }, 'shutdown_signal_received');
  server.close(() => {
    logger.info('server_closed');
    process.exit(0);
  });
  // Force shutdown after timeout
  setTimeout(() => {
    logger.error('forced_shutdown');
    process.exit(1);
  }, 10_000);
}

process.on('SIGTERM', () => gracefulShutdown('SIGTERM'));
process.on('SIGINT', () => gracefulShutdown('SIGINT'));

Python (uvicorn/FastAPI) — graceful shutdown:
import signal
import sys

# Note: when run under uvicorn, uvicorn installs its own SIGTERM/SIGINT
# handlers and drains in-flight requests; a custom handler like this is
# for plain Python entrypoints or additional cleanup work.
def handle_shutdown(signum, frame):
    logger.info("shutdown_signal_received", signal=signal.Signals(signum).name)
    sys.exit(0)

signal.signal(signal.SIGTERM, handle_shutdown)
signal.signal(signal.SIGINT, handle_shutdown)

.env must be in .gitignore — always
.env.example with placeholder values (no real secrets)

Always ensure .gitignore includes:
.env
.env.local
.env.production
.env.*.local

When creating docker-compose files for multi-service applications:
WRONG — hardcoded values, no health checks, running as root:
services:
  api:
    build: .
    ports:
      - "3000:3000"
    environment:
      DATABASE_URL: "postgresql://admin:password123@db:5432/myapp"

RIGHT — env vars, health checks, proper structure:
services:
  api:
    build:
      context: .
      target: production
    ports:
      - "${API_PORT:-3000}:3000"
    environment:
      - DATABASE_URL=${DATABASE_URL}
      - NODE_ENV=production
    depends_on:
      db:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:3000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
    restart: unless-stopped

  db:
    image: postgres:16-alpine
    environment:
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=${POSTGRES_DB}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER}"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  postgres_data:
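The HEALTHCHECK instructions and compose healthchecks above all probe a /health endpoint, which the application itself must expose. A minimal, framework-agnostic sketch using only the Python standard library (the route path and response body are illustrative assumptions, not a fixed convention):

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, fmt, *args):
        # Silence per-request logging: health probes fire every 30s
        # and would otherwise flood container logs.
        pass


def serve_health(port: int = 0) -> HTTPServer:
    """Serve /health on a background thread; port 0 picks a free port."""
    server = HTTPServer(("0.0.0.0", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

In a real service the handler should also verify critical dependencies (for example, a quick database ping) before returning 200, so the orchestrator restarts containers that are up but not actually able to serve.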