Docker in Production

Lessons: 8 modules
Total: 80 min full study
Quick: 7 min trailer
Projects: 8 Docker labs
CHEATSHEET · 01 · Docker in Production · operations cheatsheet
Dockerfile best practices
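  • Multi-stage: build stage → runtime stage; COPY --from only the artifacts the runtime needs
  • Layer caching: order FROM, then dependency install (RUN package manager), then COPY app code last so code edits do not invalidate the dependency layers
  • Non-root USER: RUN useradd -m app, then USER app as a separate instruction (no sudo in the container)
  • Minimal base: distroless or alpine, not a full ubuntu/debian OS image
  • .dockerignore: exclude node_modules, .git, .env so they never enter the build context that COPY reads
  • HEALTHCHECK: HEALTHCHECK CMD curl -f http://localhost:8080/health || exit 1 (read by Docker, Compose, and Swarm; Kubernetes uses its own probes; -f makes curl fail on HTTP errors)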
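The practices above can be combined in a single Dockerfile. The following is a minimal sketch, not the course's own file: it assumes a Node.js app whose `npm run build` emits `dist/server.js` and which serves `/health` on port 8080. The `node:20-alpine` base, the `app` user, and all paths are illustrative choices. The shell snippet only writes the file to a scratch directory so it can be reviewed before building.

```shell
# Write an illustrative multi-stage Dockerfile into a scratch directory.
# Assumptions (not from the cheatsheet): Node.js app, build output in dist/,
# server on port 8080 exposing /health.
workdir="$(mktemp -d)"
cat > "$workdir/Dockerfile" <<'EOF'
# Build stage: full toolchain, never shipped
FROM node:20-alpine AS build
WORKDIR /src
# Copy manifests first so the dependency layer stays cached across code edits
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage: only what the app needs to run
FROM node:20-alpine
WORKDIR /app
# Alpine ships BusyBox adduser rather than useradd
RUN adduser -D app
COPY --from=build /src/package.json ./
COPY --from=build /src/node_modules ./node_modules
COPY --from=build /src/dist ./dist
USER app
HEALTHCHECK --interval=30s --timeout=3s \
  CMD wget -q -O /dev/null http://localhost:8080/health || exit 1
# Exec form: node runs as PID 1 and receives SIGTERM directly
CMD ["node", "dist/server.js"]
EOF
echo "Dockerfile written to $workdir/Dockerfile"
# To build from your project root:
#   docker build -f "$workdir/Dockerfile" -t myapp:dev .
```

The healthcheck uses BusyBox `wget` because the Alpine base does not include `curl`; on a base image that ships `curl`, `curl -f` works the same way.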
Docker Compose for multi-service apps
  • services: define each container with image or build, ports, environment, volumes
  • depends_on: condition: service_healthy waits for the dependency's healthcheck to pass, not just for the container to start
  • Named volumes: db_data: {} persists across container restarts and removal
  • env_file: load secrets from .env.prod (never commit secrets)
  • override: docker-compose.override.yml is merged automatically for local dev; in prod pass explicit -f files so it is skipped
  • networks: a user-defined bridge (Compose creates one by default) gives DNS discovery by service name
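As a rough illustration of how these keys fit together, the sketch below writes a two-service Compose file. The `app` and `db` service names, the `postgres:16` image, and the `pg_isready` healthcheck are assumptions chosen for the example; only the key names come from the list above.

```shell
# Write an illustrative Compose file into a scratch directory.
# Assumptions: an app built from the current directory plus a Postgres
# database; .env.prod must exist next to the file before it will validate.
stack_dir="$(mktemp -d)"
cat > "$stack_dir/docker-compose.yml" <<'EOF'
services:
  app:
    build: .
    ports:
      - "8080:8080"
    env_file: .env.prod
    depends_on:
      db:
        # Wait for the db healthcheck to pass, not just for container start
        condition: service_healthy
  db:
    image: postgres:16
    env_file: .env.prod
    volumes:
      - db_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 10

volumes:
  # Named volume: outlives the db container
  db_data: {}
EOF
echo "Compose file written to $stack_dir/docker-compose.yml"
# To validate it (requires .env.prod):
#   docker compose -f "$stack_dir/docker-compose.yml" config
```

Inside this stack the app reaches the database at the hostname `db`, because Compose attaches both services to the project's default network.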
Image tagging and registry workflow
  • Semver tags: v1.2.3 for releases; latest is a mutable tag that moves with every push (risky to deploy)
  • SHA digest: docker pull image@sha256:abc... pins the exact build, no surprises
  • GHCR login: echo "$GH_TOKEN" | docker login ghcr.io -u USERNAME --password-stdin
  • Tag before push: docker tag myapp:latest ghcr.io/user/myapp:v1.2.3
  • Multi-platform: docker buildx build --platform linux/amd64,linux/arm64 -t image --push . (--push sends the multi-arch manifest straight to the registry)
  • Scan before push: docker scout cves image (locally, or in CI before the registry push)
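The tag-then-push steps above can be wrapped in a small helper that builds the registry reference consistently. This is a sketch under assumptions: `ghcr_ref` is a hypothetical function name, `Example-Org` and `MyApp` are placeholder values, and the version check is a deliberately loose glob rather than a full semver validator.

```shell
# Build a GHCR image reference for a release tag.
# Registry paths must be lowercase, so owner and repo are lowercased here.
ghcr_ref() {
  owner=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  repo=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  tag=$3
  case "$tag" in
    v[0-9]*.[0-9]*.[0-9]*) ;;  # loose vMAJOR.MINOR.PATCH check
    *)
      echo "refusing tag '$tag': expected vMAJOR.MINOR.PATCH" >&2
      return 1
      ;;
  esac
  printf 'ghcr.io/%s/%s:%s\n' "$owner" "$repo" "$tag"
}

# Dry run: print the commands instead of executing them.
ref=$(ghcr_ref "Example-Org" "MyApp" "v1.2.3")
echo "docker tag myapp:latest $ref"
echo "docker push $ref"
```

Rejecting `latest` in the helper keeps release pushes on immutable-looking version tags; deployments can then pin the digest that `docker push` reports.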
Container security hardening
  • Read-only root: docker run --read-only --tmpfs /tmp (prevents tampering with the container filesystem)
  • Drop capabilities: --cap-drop=ALL --cap-add=NET_BIND_SERVICE (least privilege)
  • No new privileges: --security-opt=no-new-privileges (blocks privilege escalation via setuid binaries)
  • Docker Scout: docker scout cves image; fix HIGH/CRITICAL findings before shipping
  • Distroless base: FROM gcr.io/distroless/base (no shell, no package manager)
  • Secrets: docker secret create (Swarm) or env_file + .gitignore (Compose)
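The three runtime flags above are easy to forget individually, so one option is to bundle them in a wrapper. The sketch below assumes a hypothetical helper name, `run_hardened`, and simply forwards any extra arguments to the container command.

```shell
# Run an image with the hardening flags from the list above.
# Usage: run_hardened <image> [command args...]
run_hardened() {
  image=$1
  shift
  docker run --rm \
    --read-only --tmpfs /tmp \
    --cap-drop=ALL --cap-add=NET_BIND_SERVICE \
    --security-opt=no-new-privileges \
    "$image" "$@"
}

# Example (requires Docker and an existing image):
#   run_hardened myapp:v1.2.3
```

If the app fails under this wrapper, the usual causes are writes outside `/tmp` (add another `--tmpfs`) or a capability it genuinely needs (add it back explicitly with `--cap-add`).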
Volumes and data persistence
  • Named volume: volumes: db_data: {} survives container rm, portable
  • Bind mount: volumes: - ./data:/app/data (dev only; ties the service to a host path)
  • Volume driver: volumes: db_data: {driver: local}; NFS needs driver_opts (type: nfs, o, device), cloud storage needs a volume plugin
  • Backup: docker run --rm -v db_data:/data -v "$(pwd)":/backup alpine tar czf /backup/db.tar.gz /data
  • Restore: docker run --rm -v db_data:/data -v "$(pwd)":/backup alpine tar xzf /backup/db.tar.gz -C / (the archive stores paths as data/..., hence -C /)
  • Inspect: docker volume inspect db_data (shows Mountpoint on host)
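The backup and restore one-liners above can be turned into reusable functions. This sketch is a variant, not the same commands: it archives relative to `/data` (`-C /data .`) so the restore extracts with `-C /data`, mounts the source read-only, and timestamps the archive. The function names and the timestamp format are assumptions, and `out_dir` should be an absolute host path.

```shell
# Archive a named volume into <out_dir>/<volume>-<timestamp>.tar.gz
# and print the archive path.
backup_volume() {
  volume=$1
  out_dir=$2   # absolute host directory
  archive="${volume}-$(date +%Y%m%d-%H%M%S).tar.gz"
  docker run --rm \
    -v "${volume}:/data:ro" \
    -v "${out_dir}:/backup" \
    alpine tar czf "/backup/${archive}" -C /data . || return 1
  printf '%s\n' "${out_dir}/${archive}"
}

# Extract an archive made by backup_volume into a named volume.
restore_volume() {
  volume=$1
  archive_path=$2   # absolute host path to the .tar.gz
  docker run --rm \
    -v "${volume}:/data" \
    -v "$(dirname "$archive_path"):/backup:ro" \
    alpine tar xzf "/backup/$(basename "$archive_path")" -C /data
}

# Example (requires Docker):
#   path=$(backup_volume db_data "$(pwd)") && restore_volume db_data_copy "$path"
```

Stop or quiesce the database before backing up; a tar of live database files may not be consistent.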
Observability stack (Prometheus + Grafana + cAdvisor)
  • cAdvisor: docker run -p 8080:8080 --volume=/:/rootfs:ro --volume=/var/run:/var/run:ro --volume=/sys:/sys:ro --volume=/var/lib/docker/:/var/lib/docker:ro gcr.io/cadvisor/cadvisor
  • Prometheus: scrape_configs: - job_name: cadvisor, static_configs: - targets: [cadvisor:8080]
  • Grafana: add a Prometheus datasource, import dashboard ID 14981 (cAdvisor)
  • Alert rule: expr: container_memory_usage_bytes > 1e9, for: 5m (memory above 1 GB for five minutes)
  • Metrics to watch: container_cpu_usage_seconds_total, container_memory_usage_bytes, container_network_receive_bytes_total
  • Compose stack: all four services (app, cAdvisor, Prometheus, Grafana) in one docker-compose.yml
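The scrape job and alert rule above are abbreviated; in full they might look like the two files written below. This is a sketch: the file names, the `ContainerMemoryHigh` alert name, the 15s scrape interval, and the `severity` label are assumptions, while the job name, target, expression, and `for` duration come from the list.

```shell
# Write illustrative Prometheus config and alert rules into a scratch directory.
prom_dir="$(mktemp -d)"

cat > "$prom_dir/prometheus.yml" <<'EOF'
global:
  scrape_interval: 15s

# Relative paths are resolved against this file's directory
rule_files:
  - alerts.yml

scrape_configs:
  - job_name: cadvisor
    static_configs:
      # Compose service name and cAdvisor's default port
      - targets: ["cadvisor:8080"]
EOF

cat > "$prom_dir/alerts.yml" <<'EOF'
groups:
  - name: containers
    rules:
      - alert: ContainerMemoryHigh
        expr: container_memory_usage_bytes > 1e9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Container {{ $labels.name }} above 1 GB for 5 minutes"
EOF

echo "Prometheus files written to $prom_dir"
```

Mounting both files into the Prometheus container under the same directory keeps the relative `rule_files` entry valid.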
CHEATSHEET · 02 · Docker in Production · 2 AM debugging cheatsheet
Container won't start or crashes immediately
  • docker logs --tail 50 <container>: check stdout/stderr for app errors (piping to | tail only trims stdout)
  • docker inspect --format '{{.State.ExitCode}} {{.State.OOMKilled}}' <container>: read the exit code (0=ok, 1=app error, 137=SIGKILL, often the OOM killer)
  • docker run --rm -it --entrypoint /bin/sh <image>: explore the image interactively; fails on shell-less bases such as distroless
  • docker ps -a --filter name=<name>: confirm the container exists; docker rm -f if it is stuck
  • Check ENTRYPOINT/CMD syntax in Dockerfile: exec form (JSON array) makes the app PID 1 so it receives SIGTERM; shell form wraps it in /bin/sh -c
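Exit codes above 128 encode the signal that killed the process (128 + signal number), and `docker run` reserves 125 to 127 for its own failures. The lookup below is a sketch of that triage; `explain_exit_code` is a hypothetical helper name and the hint texts are illustrative.

```shell
# Map a container exit code to a first debugging hint.
explain_exit_code() {
  code=$1
  case "$code" in
    0)   echo "clean exit" ;;
    125) echo "docker run itself failed (bad flag or daemon error)" ;;
    126) echo "command found but could not be invoked" ;;
    127) echo "command not found in the image" ;;
    137) echo "SIGKILL: check State.OOMKilled or a stop timeout" ;;
    143) echo "SIGTERM: stopped by docker stop or an orchestrator" ;;
    *)
      # Codes above 128 mean "killed by signal (code - 128)"
      if [ "$code" -gt 128 ] 2>/dev/null; then
        echo "killed by signal $((code - 128))"
      else
        echo "application error (exit $code)"
      fi
      ;;
  esac
}

# Example (requires Docker):
#   explain_exit_code "$(docker inspect --format '{{.State.ExitCode}}' <container>)"
```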
Image build fails or bloats unexpectedly
  • docker build --progress=plain . 2>&1 | grep -i error: see the exact step that failed
  • docker history <image>: inspect layer sizes; look for >100MB jumps
  • Rebuild with --no-cache if the layer cache is stale (apt-get, npm install, etc.)
  • Check that rm -rf /var/lib/apt/lists/* runs in the same RUN as apt-get install; cleanup in a later RUN does not shrink the earlier layer
  • Verify multi-stage COPY --from: ensure the final stage does not copy /var/cache or build artifacts
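The apt cleanup point is easiest to see side by side. The sketch below writes a fragment in which install and cleanup share one RUN, so the package lists never land in a committed layer. The `debian:bookworm-slim` base and the installed packages are placeholder assumptions.

```shell
# Write an illustrative Dockerfile fragment showing same-layer apt cleanup.
frag_dir="$(mktemp -d)"
cat > "$frag_dir/Dockerfile" <<'EOF'
FROM debian:bookworm-slim
# Install and clean up in ONE RUN: a separate "RUN rm -rf ..." would add a
# new layer on top and leave the files in the layer below.
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl \
 && rm -rf /var/lib/apt/lists/*
EOF
echo "Fragment written to $frag_dir/Dockerfile"
# After building, compare layer sizes with: docker history <image>
```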
Compose stack networking or service discovery broken
  • docker compose ps: verify all services are Up; check the PORTS column for port conflicts
  • docker compose logs <service>: check for DNS resolution errors (getaddrinfo ENOTFOUND)
  • docker compose exec <service> ping <other-service>: test DNS from inside the container (try getent hosts <other-service> if ping is not installed)
  • Verify service names match the compose file; name-based DNS works on user-defined networks (including the Compose default), not on the legacy default bridge
  • Check depends_on: <service>: the short form does not wait for readiness; add a healthcheck and condition: service_healthy
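When a script outside Compose needs the same readiness guarantee, it can poll the container's health status directly. This is a minimal sketch: `wait_healthy` is a hypothetical helper, the 2-second poll interval and 60-second default timeout are arbitrary, and the container must define a healthcheck for the status to ever read `healthy`.

```shell
# Poll a container's health status until it is "healthy" or the timeout expires.
# Usage: wait_healthy <container> [timeout_seconds]
wait_healthy() {
  container=$1
  timeout=${2:-60}
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # Containers without a healthcheck make the template fail: report "unknown"
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null || echo "unknown")
    if [ "$status" = "healthy" ]; then
      return 0
    fi
    sleep 2
    elapsed=$((elapsed + 2))
  done
  echo "$container not healthy after ${timeout}s" >&2
  return 1
}

# Example (requires Docker):
#   wait_healthy myproject-db-1 90 && echo "db is ready"
```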
Volume or data persistence issues
  • docker volume ls | grep <name>: confirm the named volume exists; docker volume inspect for its path
  • docker run --rm -v <vol>:/data <image> ls -la /data: verify mount point and permissions
  • Check the volume driver (local vs. nfs); docker volume inspect shows the Driver field
  • Bind mounts: verify the host path exists and has correct ownership (containers run as root by default)
  • docker compose down does NOT delete named volumes; use docker volume rm or docker compose down -v
Image security scan or CVE alerts firing
  • docker scout cves --only-severity critical,high <image>: run Scout locally, filtered to CRITICAL/HIGH
  • docker scout recommendations <image>: suggests base image upgrades and fixes
  • Check Dockerfile: is the base image pinned to a digest (not :latest)? Use distroless if possible
  • docker run --read-only <image>: test a read-only filesystem; the app may need --tmpfs for /tmp or /var/tmp
  • Verify USER is non-root (docker inspect --format '{{.Config.User}}' <image>; empty means root); a root container widens the impact of any exploited CVE
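The non-root check can run as a gate in CI. The sketch below assumes a hypothetical helper name, `check_nonroot`, and treats an empty user, `root`, or UID `0` as failures.

```shell
# Fail if an image is configured to run as root.
# Usage: check_nonroot <image>
check_nonroot() {
  image=$1
  user=$(docker inspect --format '{{.Config.User}}' "$image") || return 1
  case "$user" in
    ""|root|root:*|0|0:*)
      # An empty Config.User means the image never set USER, so it runs as root
      echo "FAIL: $image runs as root" >&2
      return 1
      ;;
    *)
      echo "OK: $image runs as $user"
      ;;
  esac
}

# Example (requires Docker and a local image):
#   check_nonroot myapp:v1.2.3 || exit 1
```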
CI/CD pipeline or registry push fails
  • docker login -u <user> <registry>: verify credentials; check ~/.docker/config.json
  • docker tag <image> <registry>/<repo>:<tag>: confirm the tag format before push
  • docker push <registry>/<repo>:<tag>: check for 401 Unauthorized (auth) or 403 Forbidden (perms)
  • GitHub Actions: verify the workflow grants GITHUB_TOKEN permissions: packages: write; use ghcr.io/owner/repo (lowercase)
  • docker pull <registry>/<repo>@<digest>: if push succeeded but pull fails, check the image manifest and platforms (docker buildx imagetools inspect <registry>/<repo>:<tag>)
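For the GitHub Actions case, a workflow along these lines grants the token the package permission and pushes on version tags. It is a sketch, not a drop-in pipeline: the workflow name, the `example-org/myapp` image path, and the pinned action major versions are assumptions to adapt.

```shell
# Write an illustrative GitHub Actions workflow into a scratch directory.
wf_dir="$(mktemp -d)"
cat > "$wf_dir/publish-image.yml" <<'EOF'
name: publish-image
on:
  push:
    tags: ["v*.*.*"]

# Without packages: write, the push to GHCR is rejected
permissions:
  contents: read
  packages: write

jobs:
  push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          # Image path must be lowercase; example-org/myapp is a placeholder
          tags: ghcr.io/example-org/myapp:${{ github.ref_name }}
EOF
echo "Workflow written to $wf_dir/publish-image.yml"
# Copy it to .github/workflows/ in your repository to use it.
```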