DevOps essentials — Dockerfile best practices, CI/CD patterns, deployment configuration, and container security
Any time you create a Dockerfile, docker-compose file, or CI/CD pipeline, or prepare an application for deployment, you MUST follow these DevOps best practices. There are no exceptions.
This is not optional. This is not something to add later. These practices are part of building a production application, just like error handling and testing.
Single-stage builds ship compilers, dev dependencies, and build tools into production. This creates bloated, insecure images.
WRONG — single-stage build with dev dependencies in production:
FROM node:20
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build
CMD ["node", "dist/index.js"]

RIGHT — multi-stage build, minimal production image:
# Build stage
FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Production stage
FROM node:20-alpine AS production
WORKDIR /app
COPY package*.json ./
# Install production dependencies only — copying node_modules from the
# build stage would ship devDependencies into the production image
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
ENV NODE_ENV=production
CMD ["node", "dist/index.js"]

Python example — multi-stage:
# Build stage
FROM python:3.12-slim AS build
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt
COPY . .
# Production stage
FROM python:3.12-slim AS production
WORKDIR /app
COPY --from=build /install /usr/local
COPY --from=build /app .
CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

Containers run as root by default. A container escape with root access compromises the host. Always create and switch to a non-root user.
WRONG — running as root (the default):
FROM node:20-alpine
WORKDIR /app
COPY . .
CMD ["node", "index.js"]

RIGHT — non-root user:
# (build stage as in the multi-stage example above)
FROM node:20-alpine AS production
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
# Create non-root user and set ownership
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
RUN chown -R appuser:appgroup /app
USER appuser
CMD ["node", "dist/index.js"]

Python example — non-root:
# (build stage as in the multi-stage Python example above)
FROM python:3.12-slim AS production
WORKDIR /app
COPY --from=build /install /usr/local
COPY --from=build /app .
RUN groupadd -r appgroup && useradd -r -g appgroup appuser
RUN chown -R appuser:appgroup /app
USER appuser
CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

This is the most commonly forgotten step. Without .dockerignore, Docker copies everything including node_modules, .git, .env files, and test artifacts into the build context. This bloats the image, leaks secrets, and breaks builds.
CRITICAL: Every time you create a Dockerfile, you MUST also create a .dockerignore file in the same directory (or project root). This is not optional. A Dockerfile without a .dockerignore is incomplete.
Always create a .dockerignore file with at minimum:
node_modules
.git
.gitignore
.env
.env.*
*.md
.vscode
.idea
dist
coverage
.nyc_output
__pycache__
*.pyc
.pytest_cache
.venv
venv

Docker caches each layer. When a layer changes, all subsequent layers are invalidated. Copy dependency files first, install, then copy source code. This way source code changes don't re-trigger dependency installation.
WRONG — copies everything first, cache busted on every source change:
FROM node:20-alpine
WORKDIR /app
COPY . .
RUN npm install
CMD ["node", "index.js"]

RIGHT — dependency files first, then install, then source:
FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

Python example — cache-efficient:
FROM python:3.12-slim AS build
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

Without a HEALTHCHECK, Docker and orchestrators cannot determine if your application is actually serving traffic or is stuck in a broken state.
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

Or with curl:
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD curl -f http://localhost:8000/health || exit 1

Never use :latest — it causes unpredictable builds and breaks reproducibility.
WRONG:
FROM node:latest
FROM python:latest

RIGHT:
FROM node:20-alpine
FROM python:3.12-slim

Every CI/CD pipeline must have distinct stages. Tests MUST pass before deployment. Never combine build and deploy into a single step.
GitHub Actions — RIGHT:
name: CI/CD

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'
      - run: npm ci
      - run: npm run lint
      - run: npm run type-check

  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'
      - run: npm ci
      - run: npm test

  deploy:
    needs: [lint, test]
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy
        run: echo "Deploy step"
        env:
          DEPLOY_TOKEN: ${{ secrets.DEPLOY_TOKEN }}

Key rules:
needs: [lint, test] enforces stage ordering: deploy cannot start until lint and test both pass
if: github.ref == 'refs/heads/main' restricts deployment to the main branch

WRONG — secrets hardcoded or in build args:
env:
  DATABASE_URL: "postgresql://user:password@db.example.com/prod"
  API_KEY: "sk-abc123"

RIGHT — use repository secrets and environment variables:
env:
  DATABASE_URL: ${{ secrets.DATABASE_URL }}
  API_KEY: ${{ secrets.API_KEY }}

All configuration must come from environment variables. Never hardcode connection strings, API keys, ports, or feature flags.
WRONG — hardcoded config:
const db = new Pool({
  host: 'db.production.example.com',
  port: 5432,
  password: 'supersecret',
});
const port = 3000;

RIGHT — environment variables with validation:
const requiredEnv = (name: string): string => {
  const value = process.env[name];
  if (!value) throw new Error(`Missing required environment variable: ${name}`);
  return value;
};

const config = {
  port: parseInt(process.env.PORT || '3000', 10),
  databaseUrl: requiredEnv('DATABASE_URL'),
  nodeEnv: process.env.NODE_ENV || 'development',
};

Python example:
import os

def required_env(name: str) -> str:
    value = os.environ.get(name)
    if not value:
        raise ValueError(f"Missing required environment variable: {name}")
    return value

DATABASE_URL = required_env("DATABASE_URL")
PORT = int(os.environ.get("PORT", "8000"))

When a container is stopped or a new version is deployed, the orchestrator sends SIGTERM. Your application must handle it to finish in-flight requests and close connections cleanly. Without this, requests are dropped during deployments.
Node.js — graceful shutdown:
const server = app.listen(config.port, () => {
  logger.info({ port: config.port }, 'server_started');
});

function gracefulShutdown(signal: string) {
  logger.info({ signal }, 'shutdown_signal_received');
  server.close(() => {
    logger.info('server_closed');
    process.exit(0);
  });
  // Force shutdown after timeout
  setTimeout(() => {
    logger.error('forced_shutdown');
    process.exit(1);
  }, 10_000);
}

process.on('SIGTERM', () => gracefulShutdown('SIGTERM'));
process.on('SIGINT', () => gracefulShutdown('SIGINT'));

Python (uvicorn/FastAPI) — graceful shutdown:
import signal
import sys

# Note: when run under uvicorn, uvicorn installs its own SIGTERM/SIGINT
# handlers and drains in-flight requests; a custom handler like this is
# for plain Python entrypoints or additional cleanup work.
def handle_shutdown(signum, frame):
    logger.info("shutdown_signal_received", signal=signal.Signals(signum).name)
    sys.exit(0)

signal.signal(signal.SIGTERM, handle_shutdown)
signal.signal(signal.SIGINT, handle_shutdown)

.env must be in .gitignore — always
.env.example with placeholder values (no real secrets)

Always ensure .gitignore includes:
.env
.env.local
.env.production
.env.*.local

When creating docker-compose files for multi-service applications:
WRONG — hardcoded values, no health checks, running as root:
services:
  api:
    build: .
    ports:
      - "3000:3000"
    environment:
      DATABASE_URL: "postgresql://admin:password123@db:5432/myapp"

RIGHT — env vars, health checks, proper structure:
services:
  api:
    build:
      context: .
      target: production
    ports:
      - "${API_PORT:-3000}:3000"
    environment:
      - DATABASE_URL=${DATABASE_URL}
      - NODE_ENV=production
    depends_on:
      db:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:3000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
    restart: unless-stopped

  db:
    image: postgres:16-alpine
    environment:
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=${POSTGRES_DB}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER}"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  postgres_data:
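The HEALTHCHECK instructions and compose healthchecks above all probe a /health endpoint, which the application itself must expose. A minimal, framework-agnostic sketch using only the Python standard library (the route path and response body are illustrative assumptions, not a fixed convention):

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, fmt, *args):
        # Silence per-request logging: health probes fire every 30s
        # and would otherwise flood container logs.
        pass


def serve_health(port: int = 0) -> HTTPServer:
    """Serve /health on a background thread; port 0 picks a free port."""
    server = HTTPServer(("0.0.0.0", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

In a real service the handler should also verify critical dependencies (for example, a quick database ping) before returning 200, so the orchestrator restarts containers that are up but not actually able to serve.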