When it comes to space efficiency, Docker still isn't quite as good as it could be. The layered filesystem used by Docker sometimes occupies more space than is really necessary. Over time, a couple of enhancements have made their way into Docker to allow building more space-efficient images. The ADD instruction, for example, is smart enough to detect compressed archives (unfortunately only local ones, and only identity, gzip, bzip2 or xz archives, not zip) and adds the uncompressed contents directly into the image, unlike the COPY instruction, which simply copies the file into the image. The downside of COPY is that space is occupied by both the archive itself and its extracted contents: even if a later RUN instruction deletes the archive, the layer created by COPY still contains it. ADD, on the other hand, does not occupy any space for the archive, as the archive itself is never added to the image.
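As a minimal sketch of that difference, the following script writes two hypothetical Dockerfiles side by side. The archive name app.tar.gz and the target path /opt/app are made up for illustration, and the sketch assumes that tar is available in the base image.

```shell
#!/bin/sh
# Sketch only: app.tar.gz and /opt/app are hypothetical names.
workdir=$(mktemp -d)

# Variant 1: ADD detects the local gzip archive and extracts it in place,
# so the archive itself never becomes part of a layer.
cat > "$workdir/Dockerfile.add" <<'EOF'
FROM oraclelinux:7-slim
ADD app.tar.gz /opt/app/
EOF

# Variant 2: COPY stores the archive in its own layer. The later rm only
# hides the file; its bytes remain in the layer created by COPY.
# (Assumes tar is installed in the base image.)
cat > "$workdir/Dockerfile.copy" <<'EOF'
FROM oraclelinux:7-slim
COPY app.tar.gz /tmp/
RUN mkdir -p /opt/app && tar -xzf /tmp/app.tar.gz -C /opt/app && rm /tmp/app.tar.gz
EOF

echo "Dockerfiles written to $workdir"
# Build either one from a directory that contains app.tar.gz, for example:
#   docker build -f "$workdir/Dockerfile.add" -t demo:add .
```

Comparing the docker history output of both resulting images should show the extra space held by the COPY layer in the second variant.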
Docker multi-stage builds
With Docker 17.05-ce we got multi-stage builds, which allow you to, for example, build a binary in an intermediate image and then copy just that binary over to another, final image. The benefit is that the space occupied by source files, build tools, etc. never makes it into the final image but remains in the intermediate image only. This is great for micro containers and has become especially popular in the Go world of statically linked single binaries. However, multi-stage builds soon hit their limits when software installs pieces in various places. Common examples are installing regular yum packages, which put files into potentially many different locations on the filesystem, creating users and groups, etc. In that scenario one may quickly end up having to copy the entire filesystem, i.e. /, over to the final image. The only problem with that is that file ownership is lost, i.e. all files end up being owned by root again. This then leads to additional RUN instructions in the final image, which create yet another layer and potentially waste space again. A simple example of this can be seen in a little experiment of mine using multi-stage builds for an Oracle database installation. In that experiment I did exactly what I described above and copied over the entire filesystem via the following instruction:
COPY --from=builder / /
This instruction just copies everything (/) from the intermediate image builder over to / of the final image. However, as all files are owned by root after the copy, one additional step is necessary: changing the ownership of $ORACLE_BASE to the oracle user, which is done via the following instruction:
RUN chown -R oracle:dba $ORACLE_BASE
While the COPY instruction did exactly what was hoped for, just copying the final results after a successful install, the additional RUN instruction unfortunately added unnecessary size to the resulting image yet again (see line 8):
[oracle@localhost dockerfiles]$ docker history oracle/database:12.2.0.1-ee
IMAGE          CREATED         CREATED BY                                      SIZE     COMMENT
0328c9d11729   4 minutes ago   /bin/sh -c #(nop) CMD ["/bin/sh" "-c" "ex...   0B
3847d825c5e0   5 minutes ago   /bin/sh -c #(nop) EXPOSE 1521/tcp 5500/tcp      0B
3293438499a2   5 minutes ago   /bin/sh -c #(nop) VOLUME [/opt/oracle/ora...   0B
aa9294c8a9e1   5 minutes ago   /bin/sh -c #(nop) WORKDIR /home/oracle          0B
2b1ce0fb41e2   5 minutes ago   /bin/sh -c #(nop) USER [oracle]                 0B
a4a030e72a3d   5 minutes ago   /bin/sh -c chown -R oracle:dba $ORACLE_BASE     5.97GB
5c36909d4568   6 minutes ago   /bin/sh -c #(nop) COPY dir:bf7a903b4add9a7...   6.14GB
87ba2e289389   9 minutes ago   /bin/sh -c #(nop) ENV PATH=/opt/oracle/pr...   0B
3f1b0c5412bf   9 minutes ago   /bin/sh -c #(nop) ENV ORACLE_BASE=/opt/or...   0B
a6e9e9e5ddc1   12 days ago     /bin/sh -c #(nop) CMD ["/bin/bash"]             0B
<missing>      12 days ago     /bin/sh -c #(nop) ADD file:7f7d89f7bdf05fd...   118MB
<missing>      4 weeks ago     /bin/sh -c #(nop) MAINTAINER Oracle Linux...    0B
Of course, copying over the entire filesystem is not necessarily what you would want to do, but the experiment shows that multi-stage builds unfortunately still lack some flexibility that will hopefully be addressed soon. However, if you only add a few files, or perhaps even just one file that you have to build from source first, multi-stage builds are a great way to keep your Docker image size down.
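The single-binary case mentioned above can be sketched as follows. This is only an illustration under assumptions: the program hello.go, the stage name builder and the use of the golang and scratch images are choices made for this example, not taken from the experiment described here.

```shell
#!/bin/sh
# Sketch only: hello.go and the image names are assumptions for illustration.
workdir=$(mktemp -d)

# A trivial Go program to compile in the first stage.
cat > "$workdir/hello.go" <<'EOF'
package main

import "fmt"

func main() { fmt.Println("hello from a tiny image") }
EOF

# Stage 1 holds the compiler and sources; stage 2 receives only the binary.
cat > "$workdir/Dockerfile" <<'EOF'
FROM golang AS builder
WORKDIR /src
COPY hello.go .
# CGO_ENABLED=0 yields a statically linked binary that runs on scratch.
RUN CGO_ENABLED=0 go build -o /hello hello.go

FROM scratch
COPY --from=builder /hello /hello
ENTRYPOINT ["/hello"]
EOF

echo "Build context written to $workdir"
# Build with: docker build -t hello:tiny "$workdir"
```

The final image then contains just the one binary, while the Go toolchain and the source file stay behind in the builder stage.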
Update
Since Docker version 17.09.0-ce the COPY and ADD commands support the --chown option, making it possible to copy files into the image or another stage with the desired user and group IDs. With that option applied, the additional step above of manually chowning the copied files is no longer needed and hence no new layer is created. This came as part of moby/moby#34263.
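A minimal sketch of how the option could be used in the scenario above follows. It is not the Dockerfile from the experiment: the installation steps are left out, and creating the oracle user and dba group with groupadd and useradd in the final stage is an assumption made here so that the names given to --chown can be resolved.

```shell
#!/bin/sh
# Sketch only: the stage contents are placeholders, and the user/group
# creation in the final stage is an assumption for illustration.
workdir=$(mktemp -d)

cat > "$workdir/Dockerfile" <<'EOF'
FROM oraclelinux:7-slim AS builder
# ... hypothetical installation steps that populate /opt/oracle ...

FROM oraclelinux:7-slim
# Names passed to --chown are looked up in /etc/passwd and /etc/group of
# the final image, so the user and group have to exist there first.
RUN groupadd dba && useradd -g dba oracle
# Copy and set ownership in one step: no separate "RUN chown" layer.
COPY --from=builder --chown=oracle:dba /opt/oracle /opt/oracle
USER oracle
EOF

echo "Dockerfile written to $workdir"
# Requires Docker 17.09.0-ce or later. Build with:
#   docker build -t chown-demo "$workdir"
```

Numeric IDs (for example --chown=1000:1000) would avoid the lookup altogether, at the cost of hard-coding the IDs.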
Docker --squash option
Luckily, there is another feature that can help shrink the size of a Docker image: squashing all filesystem layers into a single one. I found that this feature is still largely unknown, most likely because it is still experimental. The squash feature was introduced with Docker 1.13 (API 1.25), back when Docker still followed a different versioning scheme. Even today, with Docker 17.06-ce, it is classified as experimental, which of course comes with a caveat. Although I have never encountered any issues with the --squash option myself, in the early days it was known to break images once in a while. I haven't seen that in recent versions, but be aware that the feature is still classified as experimental.
Enable experimental features
In order to use experimental features in Docker you first have to turn them on, which means passing the --experimental flag to the Docker daemon. This is rather easy for Docker on Mac: go to Preferences -> Daemon, check the Experimental features checkbox and click Apply & Restart. On Linux, however, you will have to modify /etc/docker/daemon.json yourself and include "experimental": true (important: no quotes around true!). With that change, my file looks like the following:
[root@localhost ~]$ cat /etc/docker/daemon.json
{
  "experimental": true,
  "storage-driver": "btrfs"
}
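Because the difference between the boolean true and the string "true" is easy to miss, a small check like the following can help. The function name has_experimental_flag is made up for this sketch, and it is a plain text match rather than a real JSON parser, so treat it as a quick sanity check only.

```shell
#!/bin/sh
# Succeeds if the given file sets "experimental" to the bare JSON boolean
# true. A quoted "true" does not match, because a quote character would
# follow the colon instead of the letter t.
has_experimental_flag() {
    grep -Eq '"experimental"[[:space:]]*:[[:space:]]*true' "$1"
}

# Usage (the path is the one mentioned above):
#   has_experimental_flag /etc/docker/daemon.json && echo "flag is set"
```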
After restarting the Docker daemon you will spot the line Experimental: true when running docker info (see line 41):
[root@localhost ~]$ systemctl restart docker
[root@localhost ~]$ docker info
Containers: 2
 Running: 0
 Paused: 0
 Stopped: 2
Images: 3
Server Version: 17.06.2-ol
Storage Driver: btrfs
 Build Version: Btrfs v4.9.1
 Library Version: 102
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host ipvlan macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 6e23458c129b551d5c9871e5174f6b1b7f6d1170
runc version: 810190ceaa507aa2727d7ae6f4790c76ec150bd2
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
 selinux
Kernel Version: 4.1.12-94.3.8.el7uek.x86_64
Operating System: Oracle Linux Server 7.3
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.795GiB
Name: localhost.localdomain
ID: GZ5A:XQWB:F5TE:56JE:GR3J:VCXJ:I7BO:EGMY:K52K:JAS3:A7ZC:BOHQ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: true
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
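Rather than scanning that output by eye, the check can be scripted. The helper below is a sketch with a made-up name (experimental_enabled); it only looks for the Experimental: true line in whatever text it receives on standard input.

```shell
#!/bin/sh
# Reads "docker info" output on stdin and succeeds if it contains the line
# "Experimental: true" (leading indentation is tolerated).
experimental_enabled() {
    grep -q '^[[:space:]]*Experimental:[[:space:]]*true[[:space:]]*$'
}

# Usage:
#   docker info 2>/dev/null | experimental_enabled && echo "ready for --squash"
```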
Now you are all set to use the --squash option. You can also verify this by running the command docker build --squash . (note the trailing dot). If experimental features are not turned on, you will receive:
[root@localhost ~]$ docker build --squash .
"--squash" is only supported on a Docker daemon with experimental features enabled
If, on the other hand, experimental features are turned on, you will get past that error and Docker will instead complain that there is no Dockerfile present to start the build:
[root@localhost ~]$ docker build --squash .
unable to prepare context: unable to evaluate symlinks in Dockerfile path: lstat /root/Dockerfile: no such file or directory
Building an Oracle Database Docker image using the --squash option
Without the --squash option, an Oracle Database Enterprise Edition Docker image is currently over 13 GB in size:
[oracle@localhost dockerfiles]$ docker images
REPOSITORY        TAG           IMAGE ID       CREATED          SIZE
oracle/database   12.2.0.1-ee   1b5ed2b790a7   20 seconds ago   13.2GB
oraclelinux       7-slim        a6e9e9e5ddc1   3 weeks ago      118MB
When taking a closer look at that image, a lot of space is wasted because the layered filesystem retains data that the final image no longer needs:
[oracle@localhost dockerfiles]$ docker history 1b5ed2b790a7
IMAGE          CREATED          CREATED BY                                      SIZE     COMMENT
1b5ed2b790a7   31 minutes ago   /bin/sh -c #(nop) CMD ["/bin/sh" "-c" "ex...   0B
3253c4b724eb   31 minutes ago   /bin/sh -c #(nop) EXPOSE 1521/tcp 5500/tcp      0B
3a2ce9bac206   31 minutes ago   /bin/sh -c #(nop) VOLUME [/opt/oracle/ora...   0B
b7576a4a19e1   31 minutes ago   /bin/sh -c #(nop) WORKDIR /home/oracle          0B
4597cb7d8aea   31 minutes ago   /bin/sh -c #(nop) USER [oracle]                 0B
fbf4e1b3363d   31 minutes ago   |0 /bin/sh -c $ORACLE_BASE/oraInventory/or...   11.2MB
07acc091e808   31 minutes ago   /bin/sh -c #(nop) USER [root]                   0B
f8b524f5a3f2   31 minutes ago   |0 /bin/sh -c $INSTALL_DIR/$INSTALL_DB_BIN...   5.97GB
6bc32fcfcbaf   37 minutes ago   /bin/sh -c #(nop) USER [oracle]                 0B
75de01f58755   37 minutes ago   |0 /bin/sh -c chmod ug+x $INSTALL_DIR/*.sh...   3.62GB
8cbd3c82fbbe   40 minutes ago   /bin/sh -c #(nop) COPY multi:9d11dcad11365...   22kB
2aeeab9ba02d   40 minutes ago   /bin/sh -c #(nop) COPY multi:10a3dd36c140a...   3.45GB
6c4c8a19d3a7   41 minutes ago   /bin/sh -c #(nop) ENV INSTALL_DIR=/opt/or...   0B
1169e05f0f02   41 minutes ago   /bin/sh -c #(nop) ENV ORACLE_BASE=/opt/or...   0B
7f19c2d035db   41 minutes ago   /bin/sh -c #(nop) MAINTAINER Gerald Venzl...    0B
a6e9e9e5ddc1   3 weeks ago      /bin/sh -c #(nop) CMD ["/bin/bash"]             0B
<missing>      3 weeks ago      /bin/sh -c #(nop) ADD file:7f7d89f7bdf05fd...   118MB
<missing>      5 weeks ago      /bin/sh -c #(nop) MAINTAINER Oracle Linux...    0B
The layers in lines 12 and 14 hold space occupied by copying the installer files into the image, unzipping them and then running the installation. The layer in line 8 also occupies unnecessary space, as all that this step did was set permissions on some files.
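To see how the individual layers add up to the reported image size, the SIZE column can be summed. The following is a rough sketch: the function name sum_layer_sizes is made up, and it assumes the decimal units (kB, MB, GB) that appear in the output above.

```shell
#!/bin/sh
# Sums human-readable sizes (one per line, as in the SIZE column) and
# prints the total in GB. Units are treated as decimal (1 GB = 1000 MB).
sum_layer_sizes() {
    awk '
        {
            gsub(/[ \t]/, "")                       # drop stray blanks
            if (!match($0, /[0-9.]+/)) next         # skip lines without a number
            value = substr($0, RSTART, RLENGTH) + 0
            unit  = substr($0, RSTART + RLENGTH)
            if      (unit == "kB") value *= 1e3
            else if (unit == "MB") value *= 1e6
            else if (unit == "GB") value *= 1e9
            total += value                          # plain "B" needs no scaling
        }
        END { printf "%.2f GB\n", total / 1e9 }
    '
}

# The non-zero layer sizes from the history listing above:
printf '%s\n' 11.2MB 5.97GB 3.62GB 22kB 3.45GB 118MB | sum_layer_sizes
# prints: 13.17 GB

# Against a live image (docker history supports the {{.Size}} format field):
#   docker history --format '{{.Size}}' oracle/database:12.2.0.1-ee | sum_layer_sizes
```

The total of roughly 13.17 GB matches the 13.2GB that docker images reports for the unsquashed image.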
With the support for squashing filesystem layers into a single one, the Oracle Database Docker image build script was also enhanced to allow passing on options such as --squash to Docker via -o. In order to build a squashed Docker image for the Oracle database, all one has to do is append -o --squash to the build script invocation. See Creating an Oracle Database Docker image for more details on the image build itself!
[oracle@localhost dockerfiles]$ ./buildDockerImage.sh -e -o --squash
Checking if required packages are present and valid...
linuxx64_12201_database.zip: OK
==========================
DOCKER info:
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 16
Server Version: 17.06.2-ol
Storage Driver: btrfs
 Build Version: Btrfs v4.9.1
 Library Version: 102
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host ipvlan macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 6e23458c129b551d5c9871e5174f6b1b7f6d1170
runc version: 810190ceaa507aa2727d7ae6f4790c76ec150bd2
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
 selinux
Kernel Version: 4.1.12-94.3.8.el7uek.x86_64
Operating System: Oracle Linux Server 7.3
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.795GiB
Name: localhost.localdomain
ID: GZ5A:XQWB:F5TE:56JE:GR3J:VCXJ:I7BO:EGMY:K52K:JAS3:A7ZC:BOHQ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: true
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
==========================
Proxy settings were found and will be used during the build.
Building image 'oracle/database:12.2.0.1-ee' ...
Sending build context to Docker daemon 3.454GB
Step 1/16 : FROM oraclelinux:7-slim
 ---> a6e9e9e5ddc1
Step 2/16 : MAINTAINER Gerald Venzl
 ---> Running in 731e99b1252f
 ---> 3e5ee1e8dcd5
Removing intermediate container 731e99b1252f
Step 3/16 : ENV ORACLE_BASE /opt/oracle ORACLE_HOME /opt/oracle/product/12.2.0.1/dbhome_1 INSTALL_FILE_1 "linuxx64_12201_database.zip" INSTALL_RSP "db_inst.rsp" CONFIG_RSP "dbca.rsp.tmpl" PWD_FILE "setPassword.sh" RUN_FILE "runOracle.sh" START_FILE "startDB.sh" CREATE_DB_FILE "createDB.sh" SETUP_LINUX_FILE "setupLinuxEnv.sh" CHECK_SPACE_FILE "checkSpace.sh" CHECK_DB_FILE "checkDBStatus.sh" USER_SCRIPTS_FILE "runUserScripts.sh" INSTALL_DB_BINARIES_FILE "installDBBinaries.sh"
 ---> Running in 55ec8a41dcf5
 ---> 89d92f9f8c18
Removing intermediate container 55ec8a41dcf5
Step 4/16 : ENV INSTALL_DIR $ORACLE_BASE/install PATH $ORACLE_HOME/bin:$ORACLE_HOME/OPatch/:/usr/sbin:$PATH LD_LIBRARY_PATH $ORACLE_HOME/lib:/usr/lib CLASSPATH $ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib
 ---> Running in ba918281f786
 ---> 71abe664d3eb
Removing intermediate container ba918281f786
Step 5/16 : COPY $INSTALL_FILE_1 $INSTALL_RSP $SETUP_LINUX_FILE $CHECK_SPACE_FILE $INSTALL_DB_BINARIES_FILE $INSTALL_DIR/
 ---> ee1bfa84da00
Removing intermediate container 8859d54f9f64
Step 6/16 : COPY $RUN_FILE $START_FILE $CREATE_DB_FILE $CONFIG_RSP $PWD_FILE $CHECK_DB_FILE $USER_SCRIPTS_FILE $ORACLE_BASE/
 ---> 617a3fef9ab7
Removing intermediate container 7e87f0a3842a
Step 7/16 : RUN chmod ug+x $INSTALL_DIR/*.sh && sync && $INSTALL_DIR/$CHECK_SPACE_FILE && $INSTALL_DIR/$SETUP_LINUX_FILE
 ---> Running in dbf12b6b3eb2
... ... ... ... ...
Removing intermediate container 44e7c6d85022
Step 10/16 : USER root
 ---> Running in 4de8eb80cf7c
 ---> 3024886cc1e9
Removing intermediate container 4de8eb80cf7c
Step 11/16 : RUN $ORACLE_BASE/oraInventory/orainstRoot.sh && $ORACLE_HOME/root.sh && rm -rf $INSTALL_DIR
 ---> Running in 778aa6699b31
Changing permissions of /opt/oracle/oraInventory.
Adding read,write permissions for group.
Removing read,write,execute permissions for world.
Changing groupname of /opt/oracle/oraInventory to dba.
The execution of the script is complete.
Check /opt/oracle/product/12.2.0.1/dbhome_1/install/root_778aa6699b31_2017-10-24_22-16-14-688141805.log for the output of root script
 ---> c957e036f82e
Removing intermediate container 778aa6699b31
Step 12/16 : USER oracle
 ---> Running in feca45328a95
 ---> 324e1973d1af
Removing intermediate container feca45328a95
Step 13/16 : WORKDIR /home/oracle
 ---> a47ca371ba2d
Removing intermediate container 48750bbc31bb
Step 14/16 : VOLUME $ORACLE_BASE/oradata
 ---> Running in 14f2be4454fb
 ---> 1147c64ecb83
Removing intermediate container 14f2be4454fb
Step 15/16 : EXPOSE 1521 5500
 ---> Running in d4936aba18df
 ---> 6424494c777a
Removing intermediate container d4936aba18df
Step 16/16 : CMD exec $ORACLE_BASE/$RUN_FILE
 ---> Running in 502a55257d11
 ---> dee493e2ff89
Removing intermediate container 502a55257d11
Successfully built 758fd037bf24
Successfully tagged oracle/database:12.2.0.1-ee

Oracle Database Docker Image for 'ee' version 12.2.0.1 is ready to be extended:

    --> oracle/database:12.2.0.1-ee

Build completed in 780 seconds.
You might spot that the image build itself now takes longer. This is because of how the --squash feature works: first Docker builds the regular image, and then it creates a new image that has all filesystem layers squashed together. The result is two images, a squashed one that is tagged accordingly and an untagged one that is the original image. Note that this also means that in order to use the --squash feature you need additional space available on your image build system!
[oracle@localhost dockerfiles]$ docker images
REPOSITORY        TAG           IMAGE ID       CREATED          SIZE
oracle/database   12.2.0.1-ee   758fd037bf24   12 minutes ago   6.26GB
<none>            <none>        dee493e2ff89   14 minutes ago   13.2GB
oraclelinux       7-slim        a6e9e9e5ddc1   3 weeks ago      118MB
The untagged image has exactly the same size as before, 13.2 GB, and is in all aspects the very same as the original image, with all its layers and the space occupied within them. The second one, however, is the squashed image, which is only 6.26 GB in size, the actual space occupied by all the files within the image. This is actually quite amazing: just by using the --squash feature, the image has shrunk to less than half the size of the original! Of course, your mileage may vary per image, depending on how much space is wasted within your image to begin with.
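For the two sizes shown above, the saving can be worked out with a one-liner. The helper name saving_percent is made up for this sketch; it simply computes (1 - squashed / original) * 100.

```shell
#!/bin/sh
# Prints by how many percent the second size is smaller than the first.
# Both arguments must use the same unit (GB in this example).
saving_percent() {
    awk -v original="$1" -v squashed="$2" \
        'BEGIN { printf "%.1f\n", (1 - squashed / original) * 100 }'
}

saving_percent 13.2 6.26
# prints: 52.6
```

So the squashed image is about 52.6% smaller than the original one in this particular build.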
Looking at the new image history you will spot something different than before:
[oracle@localhost dockerfiles]$ docker history oracle/database:12.2.0.1-ee
IMAGE          CREATED          CREATED BY                                      SIZE     COMMENT
758fd037bf24   14 minutes ago                                                   6.14GB   merge sha256:dee493e2ff89ac1317be5eeed80dfae6f9140be968a1a866c14b0998cbec2d95 to sha256:a6e9e9e5ddc1bdf50d64027c18cd27918c439797925ff16c97feaeee49b22ac2
<missing>      16 minutes ago   /bin/sh -c #(nop) CMD ["/bin/sh" "-c" "ex...   0B
<missing>      16 minutes ago   /bin/sh -c #(nop) EXPOSE 1521/tcp 5500/tcp      0B
<missing>      16 minutes ago   /bin/sh -c #(nop) VOLUME [/opt/oracle/ora...   0B
<missing>      16 minutes ago   /bin/sh -c #(nop) WORKDIR /home/oracle          0B
<missing>      16 minutes ago   /bin/sh -c #(nop) USER [oracle]                 0B
<missing>      16 minutes ago   |0 /bin/sh -c $ORACLE_BASE/oraInventory/or...   0B
<missing>      16 minutes ago   /bin/sh -c #(nop) USER [root]                   0B
<missing>      16 minutes ago   |0 /bin/sh -c $INSTALL_DIR/$INSTALL_DB_BIN...   0B
<missing>      22 minutes ago   /bin/sh -c #(nop) USER [oracle]                 0B
<missing>      22 minutes ago   |0 /bin/sh -c chmod ug+x $INSTALL_DIR/*.sh...   0B
<missing>      25 minutes ago   /bin/sh -c #(nop) COPY multi:9d11dcad11365...   0B
<missing>      25 minutes ago   /bin/sh -c #(nop) COPY multi:10a3dd36c140a...   0B
<missing>      26 minutes ago   /bin/sh -c #(nop) ENV INSTALL_DIR=/opt/or...   0B
<missing>      26 minutes ago   /bin/sh -c #(nop) ENV ORACLE_BASE=/opt/or...   0B
<missing>      26 minutes ago   /bin/sh -c #(nop) MAINTAINER Gerald Venzl...    0B
<missing>      3 weeks ago      /bin/sh -c #(nop) CMD ["/bin/bash"]             0B
<missing>      3 weeks ago      /bin/sh -c #(nop) ADD file:7f7d89f7bdf05fd...   118MB
Only two layers remain that occupy space. The first one (line 20) is the base image of 118 MB in size (the image used in the FROM instruction) and the second one (line 3) is the installed Oracle database with 6.14 GB. Note the comment next to it that says "merge sha256:…", indicating that this layer is a merge of all the layers between the two hashes shown, essentially everything between line 4 and line 18.
As the last step, you can happily go ahead now and delete the untagged image as you won’t need it anymore:
[oracle@localhost dockerfiles]$ docker images
REPOSITORY        TAG           IMAGE ID       CREATED       SIZE
oracle/database   12.2.0.1-ee   758fd037bf24   2 hours ago   6.26GB
<none>            <none>        dee493e2ff89   2 hours ago   13.2GB
oraclelinux       7-slim        a6e9e9e5ddc1   3 weeks ago   118MB
[oracle@localhost dockerfiles]$ docker rmi dee493e2ff89
Deleted: sha256:dee493e2ff89ac1317be5eeed80dfae6f9140be968a1a866c14b0998cbec2d95
Deleted: sha256:6424494c777a0b816cebc1255956618398b04d9fb170d23be6cd515832bfdbd9
Deleted: sha256:1147c64ecb83ad9d5ba16c7aec6aad1f967119e7bf64b7efd73bd61d4dd4bca9
Deleted: sha256:a47ca371ba2da6738dec94a7cb9ea5eb06e6d28d86e84ed490bc3a5ad3a047e1
Deleted: sha256:324e1973d1afa1ec595f080c8dfbe5e5744f629a77e04f1bfaf49419a55560de
Deleted: sha256:c957e036f82e969eeed2c10706809263218a571c4a98f42a2173fa9a9da1314c
Deleted: sha256:c0a4621f29f2dd5504648ab6a7d0714bb2c3978931a13dc84bb409653d4a1dfc
Deleted: sha256:3024886cc1e99900e3e98b4f400f5b196a572ddf0c108747cabc21306126917a
Deleted: sha256:25d34b0e9f1f06231d3618a64d71d628814051e3fa6dd6efe50df5cc86c7e561
Deleted: sha256:f96299b83d8e22833550edc5e429d996c77a2955713d0398d499fa470a3d785b
Deleted: sha256:7d882e4cc07ee35d929ef6b674d097dd4781daa19c4492680248e7210bdd84a6
Deleted: sha256:f032bb97ce431a38d6c8e5a9e9e747646f9b17f338a21d2ba3f96bf5ce625e69
Deleted: sha256:58b0605016d39083971e8184dced3aa9a8b0d6399f05fe6a0c8371354f7fc371
Deleted: sha256:617a3fef9ab7df8414317fc60f6fae1bdb396816429431e9e1537cf3979947a7
Deleted: sha256:cc1756dd643713e8d723b4c52ff92ebb6873b94a64d5ae1f84e39f08d7e0284c
Deleted: sha256:ee1bfa84da00e49c8117f3abc5e932003d224c921ae2311126a084188f2934c6
Deleted: sha256:09ac53e179b7c9924d79fd7483570230d18742a1355c44240ab4c08768a2dd2a
Deleted: sha256:71abe664d3eba13768568804098740dfe369346716f5cea90ee60bfdba2ca305
Deleted: sha256:89d92f9f8c18a13ea1b89b5d0856618c200d82ac7cca5002f02c1bd23059a91f
Deleted: sha256:3e5ee1e8dcd592e836ac629ed61bb0bf2dc616bef5d5162f098216539ce00657
[oracle@localhost dockerfiles]$ docker images
REPOSITORY        TAG           IMAGE ID       CREATED       SIZE
oracle/database   12.2.0.1-ee   758fd037bf24   2 hours ago   6.26GB
oraclelinux       7-slim        a6e9e9e5ddc1   3 weeks ago   118MB
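If squashed builds happen regularly, the cleanup can be scripted instead of looking up the image ID by hand. The two helper functions below are a sketch with made-up names; note that they remove every untagged (dangling) image on the host, not only the one left behind by the last --squash build.

```shell
#!/bin/sh
# Lists the IDs of untagged ("dangling") images, such as the unsquashed
# original that a --squash build leaves behind.
list_untagged_images() {
    docker images --filter dangling=true --quiet
}

# Removes them one by one. Caution: this affects ALL dangling images.
remove_untagged_images() {
    list_untagged_images | while read -r id; do
        docker rmi "$id"
    done
}

# Usage:
#   list_untagged_images      # review first
#   remove_untagged_images    # then delete
```

Since Docker 1.13, docker image prune offers a built-in alternative for removing dangling images.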
Conclusion
There are a couple of different ways to shrink the size of your Docker images, and all of them have pros and cons. The experimental --squash feature makes it easy to deflate your image to the actual size required for the files within it by squashing all layers into a single one. However, keep in mind that this feature is still experimental and might disappear in the future.
Good article. I just have a comment about the squash option. Even if you are saving space in a single image, that won't be the case globally on your Docker hosts (when you have multiple images). When images share a layer, it is present only once on your filesystem. With squash, the layer will be present in each squashed image.
So I think we have to take care when using it.
The approach I'm using on my side is to avoid the chown operation by using the chown option on the copy command. (You can find information on my blog if you want.)
That is correct, if you share layers across multiple images the squash option will prevent you from doing so. So you might not want to use squash in that scenario.
Alternatively, you can also build another image with that layer as a result and build other images on top of that new base image.
If you want something a bit more extreme, with up to 30x size reduction, you can try DockerSlim ( http://dockersl.im ). It works best for simple Docker images where you have only one app. Give it a try and if it doesn't work ping me… I'd love to find out why 🙂