A Comprehensive Guide to Writing Dockerfiles


A Dockerfile is a text document containing instructions to build a Docker image automatically. This article covers the essential aspects of writing effective Dockerfiles, including best practices and common commands.

Basic Structure and Commands

  1. FROM Instruction

The FROM instruction initializes a new build stage and sets the base image. It’s typically the first instruction in a Dockerfile; only comments, parser directives, and ARG instructions may come before it:


FROM ubuntu:20.04

FROM python:3.9-slim
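
As a minimal sketch of parameterizing the base image, an ARG declared before FROM can supply the tag. The name PYTHON_VERSION is a hypothetical build argument chosen for illustration, not something the examples above define.

```dockerfile
# PYTHON_VERSION is a hypothetical build argument; 3.9 is its default value
ARG PYTHON_VERSION=3.9

# The argument is substituted into the image tag when the build starts
FROM python:${PYTHON_VERSION}-slim
```

With this layout the default can be overridden at build time through the --build-arg option of docker build, without editing the file.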

  2. WORKDIR Instruction

WORKDIR sets the working directory for the RUN, CMD, ENTRYPOINT, COPY, and ADD instructions that follow it, creating the directory if it does not exist:


WORKDIR /app
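
A short sketch of how relative paths behave may help here: a relative WORKDIR is resolved against the previous one. The base image and the src directory name are assumptions made for illustration.

```dockerfile
FROM alpine:3.14

# Absolute path: becomes the working directory
WORKDIR /app

# Relative path: resolved against /app, so the working directory is now /app/src
WORKDIR src

# Runs in /app/src
RUN pwd
```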

  3. COPY and ADD Instructions

These instructions copy files from the build context on the host into the image:


COPY . /app/

ADD file.tar.gz /app/

COPY is preferred for plain file copying because its behavior is explicit. ADD does the same job but also extracts local tar archives automatically and can fetch files from URLs.

  4. RUN Instruction

RUN executes commands in a new layer:


RUN apt-get update && \
    apt-get install -y python3 && \
    rm -rf /var/lib/apt/lists/*

  5. ENV Instruction

ENV sets environment variables:


ENV PATH=/usr/local/bin:$PATH

ENV APP_HOME=/app
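
To illustrate why such variables are useful, the sketch below reuses APP_HOME in later instructions, which keeps the path defined in one place. The base image is an assumption; the variable name comes from the example above.

```dockerfile
FROM python:3.9-slim

# Define the application directory once
ENV APP_HOME=/app

# Later instructions can reference the variable instead of repeating the path
WORKDIR $APP_HOME
COPY . $APP_HOME/
```

Variables set with ENV also remain available to processes in the running container, unlike build-only ARG values.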

  6. EXPOSE Instruction

EXPOSE documents the network ports the container listens on. It does not publish them; ports are published at run time with docker run -p (or -P for all exposed ports):


EXPOSE 8080
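
As a small sketch of the syntax, EXPOSE also accepts a protocol suffix, with TCP used when none is given. The port numbers below are illustrative assumptions, not values any particular application requires.

```dockerfile
# TCP is the default, so this is equivalent to EXPOSE 8080
EXPOSE 8080/tcp

# UDP must be stated explicitly
EXPOSE 8125/udp
```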

  7. CMD and ENTRYPOINT Instructions

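These specify what runs when a container starts. ENTRYPOINT sets the executable, and CMD supplies default arguments that can be replaced on the docker run command line: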


ENTRYPOINT ["python"]

CMD ["app.py"]
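
The interaction between the two instructions can be sketched as follows. The file other.py and the placeholder image name in the comments are hypothetical and exist only to show how arguments are replaced.

```dockerfile
FROM python:3.9-slim
WORKDIR /app

# other.py is a hypothetical second script, used only for illustration
COPY app.py other.py ./

# Fixed executable
ENTRYPOINT ["python"]

# Default argument, replaced by anything passed after the image name
CMD ["app.py"]

# docker run <image>            runs: python app.py
# docker run <image> other.py   runs: python other.py
```

Using the JSON (exec) form for both instructions, as here, runs the process directly rather than through a shell.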

Best Practices

  1. Layer Optimization
  • Combine related commands using && to reduce layers

  • Place frequently changing instructions later in the Dockerfile

  • Use .dockerignore to exclude unnecessary files
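
The first point can be sketched with a package installation, assuming a Debian-based image and curl as an arbitrary example package. The commented-out variant shows why cleanup belongs in the same RUN instruction as the step that creates the files.

```dockerfile
FROM ubuntu:20.04

# Less effective: three layers, and the final rm cannot shrink the image
# because the package lists are already stored in the first layer.
# RUN apt-get update
# RUN apt-get install -y curl
# RUN rm -rf /var/lib/apt/lists/*

# Preferred: one layer that never contains the package lists
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*
```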

  2. Base Image Selection
  • Choose official images when possible

  • Use specific tags instead of 'latest'

  • Consider slim variants for smaller image sizes

  3. Security Considerations
  • Avoid running containers as root

  • Use multi-stage builds to reduce attack surface

  • Regularly update base images
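
One way to apply the first point is sketched below, assuming a Debian-based Python image. The user name appuser and the script app.py are illustrative choices, not requirements.

```dockerfile
FROM python:3.9-slim

# Create an unprivileged user with a home directory
RUN useradd --create-home appuser

WORKDIR /home/appuser/app

# Copied files are owned by appuser rather than root
COPY --chown=appuser:appuser . .

# Everything from here on, including the container process, runs as appuser
USER appuser

CMD ["python", "app.py"]
```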

  4. Multi-stage Builds

Example of a multi-stage build:


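# Build stage
FROM golang:1.16 AS builder
WORKDIR /app
COPY . .
# Disable cgo so the binary is statically linked and runs on Alpine (musl libc)
RUN CGO_ENABLED=0 go build -o main .

# Final stage: only the compiled binary is copied in
FROM alpine:3.14
COPY --from=builder /app/main /usr/local/bin/
CMD ["main"]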

  5. Caching Considerations
  • Place instructions that change frequently (like COPY . .) later in the Dockerfile

  • Copy specific files or directories rather than the whole build context (COPY . .) when possible

  • Keep in mind that once one instruction's cache is invalidated, every instruction after it is rebuilt
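
These points can be sketched for a Go project, assuming it uses modules and has both go.mod and go.sum in the build context. Copying the module files before the rest of the source lets the dependency download stay cached across ordinary code changes.

```dockerfile
FROM golang:1.16 AS builder
WORKDIR /app

# Copy only the module files first; this layer and the download below
# are reused from cache until go.mod or go.sum change
COPY go.mod go.sum ./
RUN go mod download

# Source changes invalidate the cache only from this point onward
COPY . .
RUN CGO_ENABLED=0 go build -o main .
```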

Example of a Complete Dockerfile


# Use official base image
FROM python:3.9-slim

# Set working directory
WORKDIR /app

# Set environment variables
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1

# Install system dependencies
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        gcc \
    && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Create non-root user
RUN useradd -m appuser && \
    chown -R appuser:appuser /app
USER appuser

# Expose port
EXPOSE 8000

# Set the default command
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]

Common Issues and Solutions

  1. Image Size
  • Use multi-stage builds

  • Clean up package manager caches

  • Use smaller base images when possible

  2. Build Time
  • Optimize layer caching

  • Use .dockerignore effectively

  • Combine related RUN commands

  3. Security
  • Scan images for vulnerabilities

  • Update base images regularly

  • Follow the principle of least privilege

Maintenance Considerations

  1. Documentation
  • Include comments in the Dockerfile

  • Maintain a README with build and run instructions

  • Document environment variables
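
Part of this documentation can live in the Dockerfile itself, as sketched below. The label keys are the standard OCI image annotation names; the label values and the LOG_LEVEL variable are hypothetical placeholders.

```dockerfile
FROM python:3.9-slim

# Image metadata, visible later through docker inspect
LABEL org.opencontainers.image.title="example-app" \
      org.opencontainers.image.description="Example web service" \
      org.opencontainers.image.version="1.0.0"

# LOG_LEVEL: hypothetical application setting, documented next to its default
ENV LOG_LEVEL=info
```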

  2. Version Control
  • Tag images appropriately

  • Use semantic versioning

  • Maintain a changelog

  3. Testing
  • Test builds in CI/CD pipeline

  • Verify functionality in different environments

  • Implement automated testing

This guide covers the fundamental aspects of writing Dockerfiles. Regular practice and staying updated with Docker’s evolving best practices will help in creating more efficient and secure container images.

Remember to adapt these guidelines based on specific use cases and requirements, as different applications may need different approaches to containerization.