Reduce Docker Image Size from 1 GB to 150 MB | Methods to Optimize Docker Images


Docker makes it easy to package software, but if you're not careful, your images can get bloated. Big images mean slower downloads, delayed deployments, and longer startup times—nobody wants that.

The main problem? We often add stuff we don't need: source code, build tools, and other junk that shouldn’t be in the final image. This leads to larger images, more vulnerabilities, and a messy setup.

But here's the good news: you can shrink that oversized 1 GB Docker image down to just 150 MB. Sound impossible? It’s not!

With multi-stage Dockerfiles, you can build small, efficient images that are faster, more secure, and optimized. In this blog, I'll show you how to use these techniques to keep your images clean and secure.

Ready to make your Docker images lean and mean? Let’s dive in! 


Why Do You Need a Multi-Stage Dockerfile?

In a typical Dockerfile with only one stage, you’ll start with a base image, copy your source code, install dependencies, build the application, and finally package everything into a single image. Sounds simple, right? But this method often includes a lot of things you don’t actually need in the final image.

Let’s break it down:

  1. Source Code: Once your application is built, you don’t need the source code anymore in the final image. However, in a single-stage Dockerfile, it often ends up there anyway.
  2. Build Tools: Tools like make, gcc, Maven, or Node.js are great for building your application but are unnecessary for running it. Keeping these tools in your final image only increases its size.
  3. Unnecessary Libraries: Many libraries or dependencies required during the build process are no longer needed once the application is compiled. These extra dependencies remain in the image and make it heavier.

Let’s consider a simple single-stage Dockerfile example:

# A single-stage Dockerfile for a Node.js app
FROM node:14
WORKDIR /app
# Copy source code
COPY . .
# Install dependencies
RUN npm install
# Expose port and start the app
EXPOSE 3000
CMD ["npm", "start"]

This Dockerfile does everything in one go: it pulls a base image, copies the source code, installs the dependencies, and starts the application. While this works, it has a few drawbacks:

  • The final image contains all the dependencies used to build the app.
  • It includes unnecessary source code files and Node.js libraries that are only required during development.
  • The image size is large, which makes deployments slower and less efficient.

In a real-world application, a Docker image built this way can be several hundred megabytes—or even over a gigabyte—because it includes everything from the build stage to the runtime, along with unneeded files.
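You can see for yourself where the space goes by inspecting the image’s layers. A quick sketch (assuming you tagged the image above as single-stage):

# List every layer in the image along with its size
docker history single-stage:latest

# Show the total size of the image
docker images single-stage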

This is where multi-stage Dockerfiles come in to solve the problem.


What is a Multi-Stage Dockerfile?

A multi-stage Dockerfile allows us to break the build process into multiple steps (or stages), each with a specific purpose. The beauty of this approach is that we can take only the necessary parts from each stage and leave the rest behind, resulting in a cleaner, smaller final image.

A multi-stage Dockerfile uses more than one FROM instruction, each defining a new stage. We can copy files from one stage to another using the COPY --from=<stage> instruction, allowing us to move the application’s built files (e.g., a compiled binary or packaged app) from the build stage to the final stage, without copying over the build tools or source code.

How It Works:

Let’s break it down using our Node.js application example:

  1. Stage 1 (Build Stage): In the first stage, we focus on building the application. This stage includes all the heavy tools like compilers, build dependencies, and the source code.
  2. Stage 2 (Final Image): In the second stage, we start fresh with a much lighter base image, like the node:14-slim image. Here, we only copy the necessary files from the build stage, such as the packaged application, into the final image.

Here’s the multi-stage Dockerfile:

# Stage 1: Build
FROM node:14 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
# Stage 2: Package
FROM node:14-slim
WORKDIR /app
COPY --from=builder /app .
EXPOSE 3000
CMD ["npm", "start"]

By using multi-stage Dockerfiles, we are effectively splitting the build and runtime environments into separate layers. This ensures that only the bare minimum components necessary to run the application are included in the final image.
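As a quick usage sketch, building and running this image looks like the following (multi-stage is just an example tag name):

# Build the image from the multi-stage Dockerfile in the current directory
docker build -t multi-stage .

# Run the container, mapping the exposed port to the host
docker run -p 3000:3000 multi-stage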


Benefits of Multi-Stage Dockerfiles

  1. Reduced Image Size: In our example, the single-stage image was 865MB, but with the multi-stage Dockerfile, we brought it down to just 177MB. This massive reduction in size means faster downloads, quicker deployments, and less disk space usage.
  2. Increased Security: A smaller image means fewer components, and fewer components mean fewer vulnerabilities. When you only include what’s essential, you reduce the attack surface of your Docker container. This is crucial in production environments.
  3. Improved Performance: Smaller images are faster to download and start up, which improves the overall performance of your deployment pipeline. Whether you’re deploying to Kubernetes, Docker Swarm, or other platforms, multi-stage Dockerfiles speed up the process.
  4. Easier Maintenance: Multi-stage Dockerfiles make it easier to manage your Docker builds. By clearly separating the build process from the runtime, you can easily update dependencies or change build configurations without affecting the final image.

Real-World Example: Express.js Application

Let’s take a simple Express.js Node application as an example. First, we’ll create the Docker image using a single-stage Dockerfile, and then optimize it with a multi-stage Dockerfile to see the impact.

Single-Stage Dockerfile

Here’s the Dockerfile we used:

# Single-stage Dockerfile for Express.js app
FROM node:14
# Set the working directory
WORKDIR /app
# Copy the package.json and install dependencies
COPY package*.json ./
RUN npm install
# Copy the source code
COPY . .
# Expose the port and run the app
EXPOSE 3000
CMD ["npm", "start"]

This works, but the image contains unnecessary files like build tools, source code, and development dependencies. Building this image produced a size of 865MB.

Multi-Stage Dockerfile

Now let’s optimize it using a multi-stage Dockerfile:

# Stage 1: Build
FROM node:14 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
# Stage 2: Package
FROM node:14-slim
WORKDIR /app
COPY --from=builder /app .
EXPOSE 3000
CMD ["npm", "start"]

With this Dockerfile, we separate the build process and only copy the necessary components into the final image. As a result, the image size is drastically reduced to 177MB.
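To reproduce this comparison yourself, build both variants under distinct tags and then list them, as shown below (the file names Dockerfile.single and Dockerfile.multi are just one way to keep the two versions side by side):

# Build the single-stage and multi-stage variants under separate tags
docker build -f Dockerfile.single -t single-stage .
docker build -f Dockerfile.multi -t multi-stage .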

Comparison of Image Sizes:

docker images
REPOSITORY      TAG        IMAGE ID       CREATED          SIZE
multi-stage     latest     e2357ff07336   8 minutes ago    177MB
single-stage    latest     fbe2938d61fd   10 minutes ago   865MB

By using a multi-stage Dockerfile, we reduced the image size from 865MB to 177MB, nearly a 5x reduction! This optimization not only saves space but also speeds up deployments and reduces the attack surface by removing unnecessary components.
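One further refinement worth sketching (this goes beyond the example above and makes an assumption about the project layout): the final stage copies node_modules from the builder, which still includes devDependencies. Installing production dependencies directly in the final stage trims the image further:

# Stage 1: Build
FROM node:14 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
# (run any build step here, e.g. transpilation)

# Stage 2: Package with production dependencies only
FROM node:14-slim
WORKDIR /app
COPY package*.json ./
# --production skips devDependencies entirely
RUN npm install --production
# Copy only the application code from the builder
# (assumes the code lives in src/ -- adjust to your layout)
COPY --from=builder /app/src ./src
EXPOSE 3000
CMD ["npm", "start"]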


Advanced Use: Multi-Stage Dockerfiles with More than Two Stages

While two stages are commonly used, you can use more than two stages in a multi-stage Dockerfile. This is especially useful when you have multiple build steps or when you want to run tests in between.

For example, in a CI/CD pipeline, you might have a three-stage Dockerfile that builds, tests, and then packages the application.

Here’s an example of a three-stage Dockerfile:

# Stage 1: Build
FROM node:14 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
# Stage 2: Test
FROM builder AS tester
RUN npm test
# Stage 3: Package
FROM node:14-slim
WORKDIR /app
COPY --from=builder /app .
EXPOSE 3000
CMD ["npm", "start"]

In this example, the second stage runs the test suite, so the build fails if the tests do. You can target specific stages during the build process by using Docker’s --target option, which allows you to build up to a specific stage. One caveat: BuildKit, the default builder in recent Docker versions, skips stages that the final image doesn’t depend on, so in a CI/CD pipeline you would build the tester stage explicitly to guarantee the tests run before packaging.
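For example, a CI pipeline might run the tests first and then produce the production image (the tags here are illustrative):

# Build only up to the tester stage; this fails if npm test fails
docker build --target tester -t myapp-test .

# Build the full Dockerfile to produce the production image
docker build -t myapp .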

Using Alpine Images to Further Reduce Image Size

One of the easiest ways to reduce Docker image size is by using lightweight base images like Alpine. Alpine is a minimal Docker image based on Alpine Linux, which is designed for security and small size. It’s commonly used to reduce the size of images significantly without sacrificing functionality.

Why Use Alpine?

  1. Extremely Small Size: The Alpine image is only about 5 MB in size, compared to hundreds of megabytes for traditional Linux distributions like Ubuntu or CentOS.
  2. Improved Security: Since Alpine has fewer packages, there’s a smaller attack surface and fewer vulnerabilities to worry about.
  3. Faster Deployments: Smaller images mean faster download times, which can make a big difference when deploying your containers across multiple environments.

Example of Using Alpine

Let’s revisit our multi-stage Dockerfile and replace the base images with Alpine to further reduce the image size:

# Stage 1: Build
FROM node:14-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
# Stage 2: Package
FROM node:14-alpine
WORKDIR /app
COPY --from=builder /app .
EXPOSE 3000
CMD ["npm", "start"]

In this Dockerfile, we’re using node:14-alpine instead of the standard Node.js image. The Alpine version of Node.js is significantly smaller than the default version because it includes only the essential components.

Comparison of Sizes

By switching to Alpine, you can reduce the image size even further. For example, the full node:14 base image is well over 800 MB (as the single-stage build above showed), and even node:14-slim is over 150 MB, while node:14-alpine comes in at roughly 120 MB. Applied across multi-stage builds, this can save hundreds of megabytes.
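You can check the base image sizes on your own machine (exact numbers vary by tag and release):

# Pull both variants and compare their reported sizes
docker pull node:14-slim
docker pull node:14-alpine
docker images node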

Security with Alpine

While Alpine is a great option for reducing image size, it’s important to ensure that any additional packages you install come from trusted sources. Alpine uses apk as its package manager, and you should clean up package caches in the same RUN instruction that installs them: files deleted in a later layer still occupy space in the earlier layers of the image.
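A minimal sketch of both patterns (curl, gcc, and musl-dev are placeholder packages):

# --no-cache skips storing the package index, so there is nothing to clean up
RUN apk add --no-cache curl

# For build-only packages, install them as a named virtual group and
# remove the whole group in the same layer once they are no longer needed
RUN apk add --no-cache --virtual .build-deps gcc musl-dev \
    && npm install \
    && apk del .build-deps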


Security Best Practices for Docker Images

When building Docker images, it’s not just about reducing size; it’s also about security. A smaller image often means fewer attack surfaces, but there are additional security practices you should follow to ensure your Docker images are as secure as possible.

1. Use Official and Verified Base Images

Always start with a trusted base image, ideally from the Docker Official Images or images that are regularly maintained and updated. Verified images reduce the risk of introducing vulnerabilities through third-party sources.
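One way to enforce this at pull time is Docker Content Trust, which refuses to pull images that aren’t signed:

# With content trust enabled, docker only pulls signed images
export DOCKER_CONTENT_TRUST=1
docker pull node:14-alpine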

2. Minimize the Number of Layers

Each instruction in your Dockerfile creates a layer, and files removed in a later layer still take up space in the earlier ones. Combine commands where possible so that installation and cleanup happen in the same layer. For example, instead of writing separate RUN commands for installing dependencies and cleaning up afterward, you can combine them:

RUN apt-get update && apt-get install -y \
    curl \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

3. Use Multi-Stage Dockerfiles

As discussed earlier, multi-stage Dockerfiles are a great way to ensure that build tools and source code don’t end up in your final image. Only copy the final artifacts into your production image, ensuring the smallest and cleanest image possible.

4. Keep the Image Updated

Always keep your Docker images up to date by regularly pulling the latest versions of base images and dependencies. Vulnerabilities are constantly being discovered and patched, so it’s crucial to update your images frequently.
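A simple habit that helps here is forcing Docker to refresh the base image during builds instead of reusing a stale local copy (myapp is an example tag):

# --pull checks the registry for a newer base image
# --no-cache rebuilds every layer so updated packages are picked up
docker build --pull --no-cache -t myapp .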

5. Use Docker Secrets

Never hard-code sensitive information like passwords or API keys directly into your Dockerfile. Instead, use Docker secrets or environment variables to handle sensitive information securely.
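For build-time secrets specifically, BuildKit can mount a secret for a single instruction without ever writing it into an image layer. A sketch (the secret id npm_token and the .npmrc source file are illustrative):

# syntax=docker/dockerfile:1
# The token is mounted only for the duration of this RUN instruction
# and never ends up in an image layer
RUN --mount=type=secret,id=npm_token,target=/root/.npmrc npm install

Pass the secret in at build time:

docker build --secret id=npm_token,src=.npmrc -t myapp .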


Conclusion

Optimizing Docker images is essential for ensuring fast deployments, efficient storage, and better security. By using multi-stage Dockerfiles, you can drastically reduce the size of your images and ensure that only the essential components are included in the final build. To take this optimization even further, using Alpine as a base image can shave off even more megabytes and reduce the potential attack surface.

In addition to these techniques, following Docker security best practices—such as using official base images, minimizing layers, and keeping images updated—will help ensure that your containers are secure, lightweight, and performant in production.

At Kubenine, we take care of all of this for you: building Docker images, handling deployments, and setting up and managing infrastructure. Leave it to us so you can focus on what really matters: your product.


Questions & Answers

  1. Why is the image size important in Docker?
    • Image size affects the time it takes to pull and start a container, as well as the amount of storage required. Smaller images are faster to download, take up less disk space, and are easier to manage in production environments.
  2. What are multi-stage Dockerfiles, and why should I use them?
    • Multi-stage Dockerfiles break the build process into multiple stages, allowing you to include only the necessary components in the final image. This helps reduce image size and improve security by removing unnecessary files like build tools and source code.
  3. How do Alpine images reduce Docker image size?
    • Alpine is a lightweight Linux distribution that reduces the base image size significantly. It includes only essential components, which makes your Docker images much smaller compared to using full-fledged Linux distributions like Ubuntu or CentOS.
  4. What are some common Docker security best practices?
    • Some best practices include using official base images, minimizing layers, keeping images updated, and avoiding hard-coding sensitive information into Dockerfiles by using Docker secrets or environment variables.
  5. How can I further optimize my Docker images?
    • Beyond using multi-stage Dockerfiles and Alpine, you can combine commands to reduce layers, keep your images regularly updated, and limit the number of packages you install to reduce both image size and potential security vulnerabilities.

Check out another insightful article on Docker by us: https://www.kubeblogs.com/the-pitfalls-of-docker-logging-common-mistakes-and-best-practices-2/