Distroless images 101: What are they, and why are they important?
Distroless is a set of container base images published by Google. They strip the operating system inside the container down to the bare minimum the application needs to run: only the application and its runtime dependencies are packaged into the image, which reduces both the image size and the attack surface.
Why should I use distroless images?
Distroless images help in three areas:
- Security: The final image contains only your application and its runtime dependencies, without the rest of an operating system userland. There are no package managers, shells, or utilities such as mkdir and chmod that you would expect to find in a standard Linux distribution, which improves the security posture of the container.
- Reduced vulnerabilities: Every binary packed into an image has to be tracked to its source and scanned for vulnerabilities regularly. The more binaries an image contains, the more exposed it is to attacks and undiscovered vulnerabilities. Running containers with a light footprint keeps that exposure down, and distroless helps with that.
- Reduced cost: Smaller images cost less to store and transfer than the larger base images commonly used today (Alpine being the exception). Deploying large images, or images pulled from various sources, to thousands of clusters and VMs increases storage and usage costs.
How do I use distroless images?
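You can find the source code of the project at https://github.com/GoogleContainerTools/distroless
The example below uses a multi-stage build: the first stage adds the application to a full Node.js image, and the second stage copies it into the distroless Node.js image.
Dockerfile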
FROM node:10.19.0 AS build-env
ADD . /app
WORKDIR /app
FROM gcr.io/distroless/nodejs
COPY --from=build-env /app /app
WORKDIR /app
CMD ["hello.js"]
hello.js
console.log("Hello World");
docker build -t helloworld .
Sending build context to Docker daemon 377.3kB
Step 1/7 : FROM node:10.19.0 AS build-env
10.19.0: Pulling from library/node
c0c53f743a40: Pull complete
66997431d390: Pull complete
0ea865e2909f: Pull complete
584bf23912b7: Pull complete
3c4c73959f29: Pull complete
63e05266fc4b: Pull complete
1f4961ce4444: Pull complete
6b0e52f69879: Pull complete
3ed75ed173e8: Pull complete
Digest: sha256:df200903ff34c07c1b9112b4fd9d1342c11eb7d99525f2b366c487f91dda8131
Status: Downloaded newer image for node:10.19.0
---> aa6432763c11
Step 2/7 : ADD . /app
---> 5a27ce38013a
Step 3/7 : WORKDIR /app
---> Running in 969453b8e74b
Removing intermediate container 969453b8e74b
---> 0e69dd8e547c
Step 4/7 : FROM gcr.io/distroless/nodejs
---> a7c04de24e19
Step 5/7 : COPY --from=build-env /app /app
---> 963f2d3c9fe4
Step 6/7 : WORKDIR /app
---> Running in e3294169dcac
Removing intermediate container e3294169dcac
---> f6cd56081736
Step 7/7 : CMD ["hello.js"]
---> Running in 324ad1001479
Removing intermediate container 324ad1001479
---> eddbd2526c69
Successfully built eddbd2526c69
Successfully tagged helloworld:latest
How can you debug a container if there is no shell?
Try starting a shell in the container:
docker run --rm -it --entrypoint /bin/bash helloworld
docker: Error response from daemon: OCI runtime create failed: container_linux.go:346: starting container process caused "exec: \"/bin/bash\": stat /bin/bash: no such file or directory": unknown.
Distroless images contain no other programs that would help us debug or troubleshoot production or UAT issues, so we need a workaround. The options are:
- Use the distroless debug images, which include a busybox shell.
- Publish your own debug image that contains a shell, so you can exec into the container and investigate (for example, take a thread dump).
- If you run on Kubernetes (for example on Google Cloud), explore ephemeral containers.
How does the image size of Node.js distroless compare with Node.js Alpine?
Image size:
IMAGE - SIZE
node:10.19.0-alpine - 83.5MB
gcr.io/distroless/nodejs - 81.2MB
As a result, the distroless image is slightly smaller, but the two are almost the same size.
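To put the difference in perspective, the arithmetic can be sketched in a few lines of Node.js. The two sizes are the figures from the comparison above; the function name compareSizes is made up for illustration.

```javascript
// Sizes in MB, taken from the image size comparison above.
const alpineMB = 83.5;
const distrolessMB = 81.2;

// Returns the absolute and relative saving of candidate versus baseline,
// both rounded to one decimal place.
function compareSizes(baselineMB, candidateMB) {
  const savedMB = baselineMB - candidateMB;
  return {
    savedMB: Number(savedMB.toFixed(1)),
    savedPercent: Number(((savedMB / baselineMB) * 100).toFixed(1)),
  };
}

const result = compareSizes(alpineMB, distrolessMB);
console.log(result);
```

With these figures the saving comes out at about 2.3 MB, or roughly 2.8% of the Alpine image, which is why size alone does not decide between the two.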
Distroless vs Alpine
Both distroless and Alpine provide very small base images with a light footprint. Which one should we use in production? From a security standpoint, distroless is the better choice, because unlike Alpine it ships no shell or package manager.
Conclusion
Overall, distroless gives the packaged application a very light footprint: the image ultimately contains only the binaries the application needs to run. This also reduces the attack surface and cost. At the time of writing, the distroless project publishes the following images:
- gcr.io/distroless/base (the execution environment for binaries such as Golang)
- gcr.io/distroless/java (based on openjdk8)
- gcr.io/distroless/cc (for Rust, D language)
- gcr.io/distroless/python2.7
- gcr.io/distroless/python3
- gcr.io/distroless/nodejs
- gcr.io/distroless/java/jetty
- gcr.io/distroless/dotnet