Taking Control of Your Docker Image
In the previous post I walked through creating a full set of Docker images to deploy Jenkins in containers. We don’t deploy our production Jenkins server in containers just yet, although we fully intend to. The real value my team has gained from this work is the ability to quickly spin up Jenkins test environments. These environments can be used to test plugins in isolation or create reproduction cases for problems we’re encountering. For example, containers make it easier to test the impact of a Jenkins upgrade on a particular configuration. I can persist the data with a data volume and upgrade my Jenkins master image without fear.
These examples are also one of the primary ways I taught myself more complicated Docker options. How would Docker work in a real world scenario? How do I solve for persistence? A year ago, I had trouble finding good documentation. I used the work discussed in these posts as a foundation to explore other possibilities, such as using Docker containers as build slaves. This agility allows my team to develop tools that give Riot engineers the freedom to own their build pipelines and build environments.
Back when I started this adventure a year ago, there also wasn’t a very good Cloudbees Jenkins image like there is today. I was hand-creating my jenkins-master image based on how I deployed Jenkins on our production servers. Which leads me to this post: it’s time to break down the Cloudbees image to really understand how it works. When I did this, I learned a lot about Jenkins: by breaking down the image and taking control of it in my own Dockerfile I can make changes and eliminate dependencies I personally don’t want to have.
Managing your dependencies is somewhat subjective. For myself, working at Riot, I like to minimize my reliance on public dependencies. In Docker terms this is all about knowing where the “FROM” clause in the Dockerfile points and what it’s retrieving. If you want to know where your images come from and what’s in them, this post will be very relevant to your interests. Likewise if you want to change the default OS, version of Java in use, or remove some of the fancier features of the Cloudbees container. If you’re happy with the “it just works” aspects of what’s been done so far, this post may be less relevant.
Controlling your own image does have these benefits:
- Control of the default OS layer of the image. If a Dockerfile relies on a chain of FROM clauses, whichever one comes first in the chain controls the OS, so you need to know everything that goes into the image before you can change it.
- Protection from upstream changes. Every image used in the inheritance chain may come from a public source, so it could be changed without warning or contain something unwanted. There is of course a security risk, but to me it’s also just about not allowing things to change without warning.
The first step is paying attention to what is in the dependency list for the Dockerfile we have. So far, I’ve used the public Jenkins CI Dockerfile for all of the tutorials. So let’s start there and see what it uses.
We first need to find the Dockerfile that defines the image we’re using. Dockerhub makes this fairly painless and we’ll be using Dockerhub to lead us to all the Dockerfiles of the images we seek, starting with the Jenkins image. To figure out which image we’re using, all that’s needed is to take a look at the jenkins-master Dockerfile we previously created.
FROM jenkins:1.609.1
MAINTAINER Maxfield Stewart

# Prep Jenkins Directories
USER root
RUN mkdir /var/log/jenkins
RUN mkdir /var/cache/jenkins
RUN chown -R jenkins:jenkins /var/log/jenkins
RUN chown -R jenkins:jenkins /var/cache/jenkins
USER jenkins

# Set list of plugins to download / update in plugins.txt like this
# pluginID:version
# credentials:1.18
# maven-plugin:2.7.1
# ...
# NOTE : Just set pluginID to download latest version of plugin.
# NOTE : All plugins need to be listed as there is no transitive dependency resolution.
COPY plugins.txt /usr/share/jenkins/plugins.txt
RUN /usr/local/bin/plugins.sh /usr/share/jenkins/plugins.txt

# Set Defaults
ENV JAVA_OPTS="-Xmx8192m"
ENV JENKINS_OPTS="--handlerCountStartup=100 --handlerCountMax=300 --logfile=/var/log/jenkins/jenkins.log --webroot=/var/cache/jenkins/war"
We can see that the FROM clause points to jenkins:1.609.1. In Dockerfile terms, that means the image named “jenkins” tagged with “1.609.1,” which happens to be the Jenkins version. Let’s go hunt that down on Dockerhub.
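If it helps to see the name and tag split spelled out, here is a minimal shell sketch. The helper names (from_image, image_name, image_tag) are made up for illustration, and it assumes a simple name:tag reference with no registry host or port in front.

```shell
#!/bin/sh
# Print the image reference named by the first FROM line of a Dockerfile.
from_image() {
    awk 'toupper($1) == "FROM" { print $2; exit }' "$1"
}

# Split a reference such as jenkins:1.609.1 into its name and its tag.
# Assumption: a plain name:tag reference (no registry host with a port).
image_name() { printf '%s\n' "${1%%:*}"; }
image_tag() {
    case "$1" in
        *:*) printf '%s\n' "${1#*:}" ;;
        *)   printf 'latest\n' ;;   # Docker falls back to "latest" when no tag is given
    esac
}

# Demo against a throwaway copy of the first two lines of our Dockerfile.
demo_dir=$(mktemp -d)
printf 'FROM jenkins:1.609.1\nMAINTAINER Maxfield Stewart\n' > "$demo_dir/Dockerfile"
ref=$(from_image "$demo_dir/Dockerfile")
echo "image: $(image_name "$ref")  tag: $(image_tag "$ref")"
rm -rf "$demo_dir"
```

The name is what you type into the Dockerhub search box, and the tag is the row to look for under the supported tags.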
- Go to: http://hub.docker.com.
- Dockerhub is super useful for sharing images publicly, and if you want you can register an account—but this tutorial doesn’t require it.
- In the search window enter the image name, in this case: “jenkins.”
- A list of image repositories comes back, click on “jenkins” at the top.
- You should now see a detailed description of the image. Notice the section titled "Supported tags and respective Dockerfile links" at the top of the Full Description—all images on Dockerhub contain this section.
- The Jenkins image offers one link to a Dockerfile for 1.625.2. So it looks like since I started this tutorial, they updated their version. Click the link for the Dockerfile (here for reference).
- Following the link takes you straight to a Github page with the Dockerfile details, which is what we’re after.
Our goal is to replicate this Dockerfile but own the dependencies, so save the text of this file. We’ll put together our new Dockerfile towards the end of this tutorial after we’ve got a full list of all of the dependencies. For the record, the current Jenkins Dockerfile is:
FROM java:8-jdk

RUN apt-get update && apt-get install -y wget git curl zip && rm -rf /var/lib/apt/lists/*

ENV JENKINS_HOME /var/jenkins_home
ENV JENKINS_SLAVE_AGENT_PORT 50000

# Jenkins is run with user `jenkins`, uid = 1000
# If you bind mount a volume from the host or a data container,
# ensure you use the same uid
RUN useradd -d "$JENKINS_HOME" -u 1000 -m -s /bin/bash jenkins

# Jenkins home directory is a volume, so configuration and build history
# can be persisted and survive image upgrades
VOLUME /var/jenkins_home

# `/usr/share/jenkins/ref/` contains all reference configuration we want
# to set on a fresh new installation. Use it to bundle additional plugins
# or config file with your custom jenkins Docker image.
RUN mkdir -p /usr/share/jenkins/ref/init.groovy.d

ENV TINI_SHA 066ad710107dc7ee05d3aa6e4974f01dc98f3888

# Use tini as subreaper in Docker container to adopt zombie processes
RUN curl -fL https://github.com/krallin/tini/releases/download/v0.5.0/tini-static -o /bin/tini && chmod +x /bin/tini \
  && echo "$TINI_SHA  /bin/tini" | sha1sum -c -

COPY init.groovy /usr/share/jenkins/ref/init.groovy.d/tcp-slave-agent-port.groovy

ENV JENKINS_VERSION 1.625.2
ENV JENKINS_SHA 395fe6975cf75d93d9fafdafe96d9aab1996233b

# could use ADD but this one does not check Last-Modified header
# see https://github.com/docker/docker/issues/8331
RUN curl -fL http://mirrors.jenkins-ci.org/war-stable/$JENKINS_VERSION/jenkins.war -o /usr/share/jenkins/jenkins.war \
  && echo "$JENKINS_SHA  /usr/share/jenkins/jenkins.war" | sha1sum -c -

ENV JENKINS_UC https://updates.jenkins-ci.org
RUN chown -R jenkins "$JENKINS_HOME" /usr/share/jenkins/ref

# for main web interface:
EXPOSE 8080

# will be used by attached slave agents:
EXPOSE 50000

ENV COPY_REFERENCE_FILE_LOG $JENKINS_HOME/copy_reference_file.log

USER jenkins

COPY jenkins.sh /usr/local/bin/jenkins.sh
ENTRYPOINT ["/bin/tini", "--", "/usr/local/bin/jenkins.sh"]

# from a derived Dockerfile, can use `RUN plugins.sh active.txt` to setup /usr/share/jenkins/ref/plugins from a support bundle
COPY plugins.sh /usr/local/bin/plugins.sh
The most important thing to note is that Jenkins uses “FROM java:8-jdk” which will be the next Dockerfile we need to hunt down.
Before we do that, however, we should understand everything in this file as we’ll want to replicate it in our own Dockerfile. There’s a lot in here: Cloudbees has put a good amount of work into making a solid Docker image. The highlights to pay attention to are as follows:
- Environment variables are set up for JENKINS_HOME, JENKINS_SLAVE_AGENT_PORT, TINI_SHA, JENKINS_VERSION, JENKINS_SHA, JENKINS_UC, and COPY_REFERENCE_FILE_LOG.
- The image uses Tini to help manage any zombie processes, which is an interesting addition we’ll probably want to keep seeing as Cloudbees feels it’s necessary.
- The Jenkins war file is pulled into the image by the Dockerfile with a curl request.
- The file itself installs wget, git, curl, and zip using “apt-get,” which lets us know the OS is some Debian or Ubuntu flavor of Linux.
- Three files are copied into the container from source: jenkins.sh, plugins.sh, and init.groovy. We’ll need versions of these in our own image if we want to share this behavior.
- A couple of ports are exposed, 8080 and 50000, for Jenkins to listen on and Slaves to talk to Jenkins respectively.
That’s a lot to take in and a lot to maintain. This is the first sign that managing our own Dockerfile will take some work, which is one of the things you have to consider when deciding to take full control.
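When a Dockerfile is this long, a quick tally of its instructions can help you see what you’re signing up to maintain. The following is a rough sketch, not a real Dockerfile parser: summarize_dockerfile is a hypothetical helper that only looks at the first word of each line, and the sample file is an abbreviated stand-in for the Jenkins Dockerfile above.

```shell
#!/bin/sh
# Count how often each Dockerfile instruction appears in a file.
# Simplification: only the first word of a line is inspected, so the
# continuation lines of a multi-line RUN are ignored rather than parsed.
summarize_dockerfile() {
    awk '
        { word = toupper($1) }
        word ~ /^(FROM|RUN|ENV|COPY|ADD|EXPOSE|VOLUME|USER|ENTRYPOINT|CMD)$/ { count[word]++ }
        END { for (w in count) printf "%s %d\n", w, count[w] }
    ' "$1" | sort
}

# Abbreviated sample taken from the Jenkins Dockerfile shown above.
sample=$(mktemp)
printf '%s\n' \
    'FROM java:8-jdk' \
    'ENV JENKINS_HOME /var/jenkins_home' \
    'ENV JENKINS_SLAVE_AGENT_PORT 50000' \
    'RUN mkdir -p /usr/share/jenkins/ref/init.groovy.d' \
    'EXPOSE 8080' \
    'EXPOSE 50000' \
    'COPY jenkins.sh /usr/local/bin/jenkins.sh' > "$sample"
summary=$(summarize_dockerfile "$sample")
echo "$summary"
rm -f "$sample"
```

Every ENV, COPY, and EXPOSE it reports is something our own Dockerfile will need to either carry over or deliberately drop.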
With the Jenkins Dockerfile set aside, we need to repeat the process for every FROM clause we find until we get to the base operating system. That means searching Dockerhub again for the next image: in this case, java:8-jdk.
- Enter “java” into the Dockerhub search window (make sure you’re at the Dockerhub main page and not just searching the Jenkins repository).
- Like with the Jenkins search, “java” comes back as the first repository. Click on it.
- Under the “Supported tags” section we can see Java has a lot of different tags and images. Find the row that mentions the tag we’re after, “8-jdk,” and follow the link to its Dockerfile.
This is an interesting image. It’s the publicly available java 8-jdk image that itself, in its FROM clause, references yet another public image, buildpack-deps:jessie-scm. So we’re going to have to hunt down another image, but first we need to finish finding out what’s in the image we have.
This image does a few things we need to pay attention to:
- Installs unzip.
- Uses apt-get to install openjdk-8-jdk.
- Installs ca-certificates-java, which appears to be required.
- Sets up the jessie-backports repository, which helps confirm that this image uses debian:jessie.
For the record here’s the entire Dockerfile:
FROM buildpack-deps:jessie-scm

# A few problems with compiling Java from source:
# 1. Oracle. Licensing prevents us from redistributing the official JDK.
# 2. Compiling OpenJDK also requires the JDK to be installed, and it gets
#    really hairy.

RUN apt-get update && apt-get install -y unzip && rm -rf /var/lib/apt/lists/*

RUN echo 'deb http://httpredir.debian.org/debian jessie-backports main' > /etc/apt/sources.list.d/jessie-backports.list

# Default to UTF-8 file.encoding
ENV LANG C.UTF-8

ENV JAVA_VERSION 8u66
ENV JAVA_DEBIAN_VERSION 8u66-b17-1~bpo8+1

# see https://bugs.debian.org/775775
# and https://github.com/docker-library/java/issues/19#issuecomment-70546872
ENV CA_CERTIFICATES_JAVA_VERSION 20140324

RUN set -x \
  && apt-get update \
  && apt-get install -y \
    openjdk-8-jdk="$JAVA_DEBIAN_VERSION" \
    ca-certificates-java="$CA_CERTIFICATES_JAVA_VERSION" \
  && rm -rf /var/lib/apt/lists/*

# see CA_CERTIFICATES_JAVA_VERSION notes above
RUN /var/lib/dpkg/info/ca-certificates-java.postinst configure

# If you're reading this and have any feedback on how this image could be
# improved, please open an issue or a pull request so we can discuss it!
We will need to replicate everything here. For now, let’s go find the next Dockerfile, which is buildpack-deps:jessie-scm. To do that we repeat the process we’ve already followed:
- Search for 'buildpack-deps' on the main Dockerhub page and select the first result.
- jessie-scm is the second entry on the “Supported Tags” list. Click the link to find its Dockerfile.
This Dockerfile is short and sweet. We can see yet another Dockerfile in the dependency chain called buildpack-deps:jessie-curl. But other than that this Dockerfile just installs five things: bzr, git, mercurial, openssh-client, and subversion.
This makes sense as it’s an SCM image. This is one of the first opportunities to weigh whether or not you want to replicate this particular behavior. First, the Cloudbees Jenkins image already installs Git. If you don’t need or use bazaar, mercurial, or subversion then you probably don’t need to install them and you can save some space in your image. For completeness, here’s the entire Dockerfile:
FROM buildpack-deps:jessie-curl

RUN apt-get update && apt-get install -y --no-install-recommends \
    bzr \
    git \
    mercurial \
    openssh-client \
    subversion \
  && rm -rf /var/lib/apt/lists/*
Let’s move on to the next dependency in the list. Back to the main Dockerhub search page.
- Search for ‘buildpack-deps’ and follow the first result.
- Follow the first link which is jessie-curl.
Looking at this image we’ve finally found the last dependency. This image has a FROM clause to “debian:jessie” which is the OS. We can see that this image has a simple purpose: to install a few more apps.
This is interesting because our other images already install all of these items. We really don’t need this image in the dependency tree as it adds no value.
We’ve now completed the dependency crawl for the Jenkins base image. We’ve found some things we need to pay attention to and copy, and we’ve also found some things we just don’t need and can throw away when making our own complete Dockerfile. For the record here’s the complete dependency chain: jenkins:1.625.2, then java:8-jdk, then buildpack-deps:jessie-scm, then buildpack-deps:jessie-curl, and finally debian:jessie.
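The crawl we just did by hand is really a loop: read the FROM line, find that image’s Dockerfile, repeat until there is nothing left to follow. Here is a small sketch of that loop over Dockerfiles you’ve saved locally. The file naming scheme (the image reference with “:” and “/” replaced by “_”) and the walk_chain helper are my own assumptions for the example, and the demo files contain only the FROM lines we found.

```shell
#!/bin/sh
# Follow FROM references through Dockerfiles saved in a local directory.
# Assumed layout: <dir>/<image reference with ":" and "/" replaced by "_">.
walk_chain() {
    dir=$1
    ref=$2
    while [ -n "$ref" ]; do
        printf '%s\n' "$ref"
        file="$dir/$(printf '%s' "$ref" | tr ':/' '__')"
        [ -f "$file" ] || break   # nothing saved for this image: treat it as the base
        ref=$(awk 'toupper($1) == "FROM" { print $2; exit }' "$file")
    done
}

# Demo: stub Dockerfiles holding only the FROM lines from the crawl above.
chain_dir=$(mktemp -d)
echo 'FROM java:8-jdk'                 > "$chain_dir/jenkins_1.625.2"
echo 'FROM buildpack-deps:jessie-scm'  > "$chain_dir/java_8-jdk"
echo 'FROM buildpack-deps:jessie-curl' > "$chain_dir/buildpack-deps_jessie-scm"
echo 'FROM debian:jessie'              > "$chain_dir/buildpack-deps_jessie-curl"
chain=$(walk_chain "$chain_dir" jenkins:1.625.2)
echo "$chain"
rm -rf "$chain_dir"
```

The last line it prints is the image with no saved Dockerfile behind it, which in our case is the operating system.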
Don’t forget: we wrapped the Jenkins image with our own Dockerfile in the previous tutorials so we need to remember that for the next step, which is making our own Dockerfile.
Making Our Own Dockerfile
With all the research done on the dependencies we can now construct our own Dockerfile. The easiest way would be to just cut and paste everything together and skip the FROM clauses. That would work, but would also create some redundant commands and cruft. We can optimize the size of the image by removing some things we probably don’t need.
We confirmed that the entire image chain is built on top of “Debian:Jessie,” and for this tutorial I’ll walk through taking control of that setup. At the end, I'll provide a link to an alternative built on top of CentOS7 which I prefer due to a strong familiarity with that OS thanks to all the work we do with it at Riot. Either OS is just fine and the power of Docker is that you can choose whatever you want for your containers.
So let’s begin making a completely updated Jenkins-master image. For the record, here’s the Dockerfile for jenkins-master we have so far (if you’ve followed all the tutorials):
FROM jenkins:1.609.1
MAINTAINER Maxfield Stewart

# Prep Jenkins Directories
USER root
RUN mkdir /var/log/jenkins
RUN mkdir /var/cache/jenkins
RUN chown -R jenkins:jenkins /var/log/jenkins
RUN chown -R jenkins:jenkins /var/cache/jenkins
USER jenkins

# Set Defaults
ENV JAVA_OPTS="-Xmx8192m"
ENV JENKINS_OPTS="--handlerCountStartup=100 --handlerCountMax=300 --logfile=/var/log/jenkins/jenkins.log --webroot=/var/cache/jenkins/war"
Noteworthy here is that we’re going to move from Jenkins 1.609.1 to 1.625.2, and that I’ll have some updated settings for the JAVA_OPTS based on some things we’ve learned in the past few weeks.
Step one: let’s change the FROM clause to Debian:
- Open the jenkins-master/Dockerfile in your favorite editor.
- Replace the FROM clause with: FROM debian:jessie.
Next step, we should get all of our application installs with apt-get squared away. Let’s make a new section in the Dockerfile and add the following:
RUN echo 'deb http://httpredir.debian.org/debian jessie-backports main' > /etc/apt/sources.list.d/jessie-backports.list

ENV LANG C.UTF-8
ENV JAVA_VERSION 8u66
ENV JAVA_DEBIAN_VERSION 8u66-b17-1~bpo8+1

# see https://bugs.debian.org/775775
# and https://github.com/docker-library/java/issues/19#issuecomment-70546872
ENV CA_CERTIFICATES_JAVA_VERSION 20140324

RUN apt-get update \
  && apt-get install -y --no-install-recommends \
    wget \
    curl \
    ca-certificates \
    zip \
    openssh-client \
    unzip \
    openjdk-8-jdk="$JAVA_DEBIAN_VERSION" \
    ca-certificates-java="$CA_CERTIFICATES_JAVA_VERSION" \
  && rm -rf /var/lib/apt/lists/*

RUN /var/lib/dpkg/info/ca-certificates-java.postinst configure
That’s a lot of stuff! You’ll note I consolidated the apt-get installs from all the Dockerfiles we looked at into this one set. To do this I had to set all the necessary environment variables used for Java versions and certificates first. I’d recommend testing that everything installs before continuing to add more stuff to the Dockerfile.
docker build jenkins-master/
We’re just testing that everything installs fine, so this image can be a throwaway. You will probably get an error about a missing Jenkins user—that’s okay. Because we changed the base image to the Debian OS, we removed (for now) the Jenkins image which was creating that user.
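If you want more than “the build didn’t fail” as evidence, a tiny smoke test can confirm the tools actually landed on the PATH. This is only a sketch: missing_tools is a hypothetical helper, and the idea is that you would run it in a shell inside a container started from the throwaway image rather than on your own machine.

```shell
#!/bin/sh
# Print every command from the argument list that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Commands the consolidated apt-get section is expected to provide
# (ssh comes from openssh-client, java from openjdk-8-jdk).
missing_tools wget curl zip unzip ssh java
```

No output means everything in the list was found. Anything printed is a package that didn’t make it into the image.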
With the installs out of the way we’ve basically absorbed the Buildpack images and the Java image. So all that’s left is adopting the Jenkins image into our master image.
There’s a lot to include so I’ll go through the things to add after doing the installs section by section. First up is installing Tini. (You can find more information on what Tini is by checking out its Github.) Cloudbees appears to recommend this as a way to help manage subprocesses in a Jenkins container so we’ll keep it and make it the next thing we install in our Dockerfile:
# Install Tini
ENV TINI_SHA 066ad710107dc7ee05d3aa6e4974f01dc98f3888

# Use tini as subreaper in Docker container to adopt zombie processes
RUN curl -fL https://github.com/krallin/tini/releases/download/v0.5.0/tini-static -o /bin/tini && chmod +x /bin/tini \
  && echo "$TINI_SHA  /bin/tini" | sha1sum -c -
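The part of that RUN line worth understanding is the checksum pipe: it feeds sha1sum a line containing the expected digest and the file path, and sha1sum -c exits non-zero if the file doesn’t match, which stops the build. Here is a standalone sketch of the same check on a scratch file. It assumes the GNU coreutils sha1sum that Debian ships, and verify_sha1 is a name I made up for the example.

```shell
#!/bin/sh
# Succeed only if the file at $2 has the SHA-1 digest given in $1.
verify_sha1() {
    # sha1sum -c reads "<digest>  <path>" lines from stdin
    echo "$1  $2" | sha1sum -c - >/dev/null 2>&1
}

# Demo on a scratch file standing in for the downloaded binary.
demo_file=$(mktemp)
printf 'pretend this is the tini binary\n' > "$demo_file"
expected_sha=$(sha1sum "$demo_file" | cut -d' ' -f1)

if verify_sha1 "$expected_sha" "$demo_file"; then before=ok; else before=mismatch; fi
printf 'tampered\n' >> "$demo_file"
if verify_sha1 "$expected_sha" "$demo_file"; then after=ok; else after=mismatch; fi

echo "untouched file: $before, modified file: $after"
rm -f "$demo_file"
```

This is why the SHA values live in environment variables: when you bump a version you also have to bump the matching digest, or the build refuses to continue.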
After installing Tini I decided to lump all the extra environment variables we need into one block as follows:
# SET Jenkins Environment Variables
ENV JENKINS_HOME /var/jenkins_home
ENV JENKINS_SLAVE_AGENT_PORT 50000
ENV JENKINS_VERSION 1.625.2
ENV JENKINS_SHA 395fe6975cf75d93d9fafdafe96d9aab1996233b
ENV JENKINS_UC https://updates.jenkins-ci.org
ENV COPY_REFERENCE_FILE_LOG $JENKINS_HOME/copy_reference_file.log
ENV JAVA_OPTS="-Xmx8192m"
ENV JENKINS_OPTS="--handlerCountStartup=100 --handlerCountMax=300 --logfile=/var/log/jenkins/jenkins.log --webroot=/var/cache/jenkins/war"
In here the two environment variables we had in our original wrapper image are added, JAVA_OPTS and JENKINS_OPTS. But I’ve also included all the environment variables Cloudbees creates to install Jenkins and set up some other options.
Next I placed the three entries we appear to need to get Jenkins installed. That’s creating the Jenkins user itself, creating a volume mount and setting up the init directory.
# Jenkins is run with user `jenkins`, uid = 1000
# If you bind mount a volume from the host or a data container,
# ensure you use the same uid
RUN useradd -d "$JENKINS_HOME" -u 1000 -m -s /bin/bash jenkins

# Jenkins home directory is a volume, so configuration and build history
# can be persisted and survive image upgrades
VOLUME /var/jenkins_home

# `/usr/share/jenkins/ref/` contains all reference configuration we want
# to set on a fresh new installation. Use it to bundle additional plugins
# or config file with your custom jenkins Docker image.
RUN mkdir -p /usr/share/jenkins/ref/init.groovy.d
With these in place we can run the curl command to grab the right jenkins.war file. Note that this uses the JENKINS_VERSION environment variable, so if you want to change the Jenkins version in the future, change that variable along with the matching JENKINS_SHA checksum.
# Install Jenkins
RUN curl -fL http://mirrors.jenkins-ci.org/war-stable/$JENKINS_VERSION/jenkins.war -o /usr/share/jenkins/jenkins.war \
  && echo "$JENKINS_SHA  /usr/share/jenkins/jenkins.war" | sha1sum -c -
Next I chose to do all the directory and user permissions. These are carried over from the jenkins-master image we created in previous tutorials, and we still want these to help better isolate our Jenkins installation and to be compatible with our data volume container. The one we bring in from the Cloudbees image is the jenkins/ref directory.
# Prep Jenkins Directories
RUN chown -R jenkins "$JENKINS_HOME" /usr/share/jenkins/ref
RUN mkdir /var/log/jenkins
RUN mkdir /var/cache/jenkins
RUN chown -R jenkins:jenkins /var/log/jenkins
RUN chown -R jenkins:jenkins /var/cache/jenkins
Next up I go ahead and expose the ports we need:
# Expose Ports for web and slave agents
EXPOSE 8080
EXPOSE 50000
All that’s left is to copy in the utility files that Cloudbees has in their image, set the Jenkins user, and run startup commands. I left the “COPY” entries until now on purpose. These files are likely to change outside the Dockerfile, and when a copied file changes Docker invalidates the build cache for that COPY instruction and everything after it, so keeping them near the end means the earlier layers stay cached. So here they are:
# Copy in local config files
COPY init.groovy /usr/share/jenkins/ref/init.groovy.d/tcp-slave-agent-port.groovy
COPY jenkins.sh /usr/local/bin/jenkins.sh
COPY plugins.sh /usr/local/bin/plugins.sh
RUN chmod +x /usr/local/bin/plugins.sh
RUN chmod +x /usr/local/bin/jenkins.sh
Note: Until we get copies of these files into our repository, the COPY commands won’t work and our Dockerfile won’t build. We’ll take care of that in the next part where we test everything. Pay special attention to the fact that I added the “chmod +x” commands, because this guarantees the files being added are executable. For now finish up by setting the Jenkins user and entry point.
# Switch to the jenkins user
USER jenkins

# Tini as the entry point to manage zombie processes
ENTRYPOINT ["/bin/tini", "--", "/usr/local/bin/jenkins.sh"]
You can of course find the entire file in the tutorial on Github. For now let’s test all the changes we just made by building the Dockerfile. Remember that we’re expecting errors when we get to the COPY commands.
docker build jenkins-master/
Everything should work as expected including the errors for the missing shell scripts. So that’s the last thing we need to take care of.
Go to the Cloudbees Jenkins Dockerfile Github repo. We need to get copies of the three files they use: init.groovy, plugins.sh, and jenkins.sh.
Download or make copies of these files and put them in the jenkins-master folder right next to the Dockerfile. You technically don’t have to keep these, but before you decide not to use them here’s a quick rundown of what they do:
- init.groovy - This file is run when Jenkins starts, and as a groovy file it runs with context inside the Java WAR that is Jenkins. In this case it’s taking the environment variable set for the Slave agents (50000) and making sure that is the port Jenkins uses. You can do a lot more with this groovy file to guarantee Jenkins starts with the same configuration settings, even on a fresh installation.
- plugins.sh - This is a handy file that can be run to auto-download a list of plugins from a plugins text file. You’ll have to include that file yourself but this is handy. In a future blog post I’ll use this to make sure things like the Docker-plugin are always installed.
- jenkins.sh - This is a shell script that starts Jenkins taking advantage of JAVA_OPTS and JENKINS_OPTS environment variables we set.
All said, Cloudbees provides a useful set of functional scripts so I recommend you keep them. A downside to having our own Dockerfile like this is that if Cloudbees chooses to update these in the future, you won’t get automatic updates. It will pay to keep apprised of any changes Cloudbees makes in case you want to take advantage of them.
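Since a missing file only shows up as a failed COPY partway through the build, a quick preflight check can save a build cycle. The sketch below lists COPY sources that aren’t in the build context yet. It is deliberately simple: missing_copy_sources is a hypothetical helper that assumes one source per COPY line with no flags, and the demo builds a scratch context rather than touching your jenkins-master folder.

```shell
#!/bin/sh
# List files that a Dockerfile COPYs but that are absent from the build context.
# Simplification: handles "COPY <src> <dest>" with a single source and no flags.
missing_copy_sources() {
    context=$1
    awk 'toupper($1) == "COPY" { print $2 }' "$context/Dockerfile" |
    while read -r src; do
        [ -e "$context/$src" ] || printf '%s\n' "$src"
    done
}

# Demo: a scratch build context where only one of the three files exists so far.
ctx=$(mktemp -d)
printf '%s\n' \
    'FROM debian:jessie' \
    'COPY init.groovy /usr/share/jenkins/ref/init.groovy.d/tcp-slave-agent-port.groovy' \
    'COPY jenkins.sh /usr/local/bin/jenkins.sh' \
    'COPY plugins.sh /usr/local/bin/plugins.sh' > "$ctx/Dockerfile"
: > "$ctx/jenkins.sh"
missing=$(missing_copy_sources "$ctx")
echo "missing from build context:"
echo "$missing"
rm -rf "$ctx"
```

Pointed at the real folder, the call would be missing_copy_sources jenkins-master, and no output means all the COPY sources are present.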
With these files in place, our updated Dockerfile for the jenkins-master image is ready. If you’ve followed the tutorial series this far you should have a makefile and docker-compose installed. The next step is to do a final image build and start our new Jenkins application:
docker-compose up -d
- Point your browser to: http://yourdockermachineip.
- Jenkins should start.
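Jenkins can take a little while to come up after the container starts, so if you script this step it helps to retry instead of checking once. Here is a minimal retry loop as a sketch. wait_for is a hypothetical helper, the demo uses a stand-in probe so it runs anywhere, and the commented curl line assumes Jenkins answers unauthenticated requests with a success status at the placeholder address from the step above.

```shell
#!/bin/sh
# Run a probe command until it succeeds or the attempts run out.
# Usage: wait_for <attempts> <delay-seconds> <command> [args...]
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Against a real container the probe might look like this (assumed usage):
#   wait_for 30 2 curl -fs -o /dev/null http://yourdockermachineip

# Demo with a stand-in probe that only succeeds on its third call.
count_file=$(mktemp)
echo 0 > "$count_file"
flaky_probe() {
    n=$(( $(cat "$count_file") + 1 ))
    echo "$n" > "$count_file"
    [ "$n" -ge 3 ]
}
if wait_for 5 0 flaky_probe; then result=up; else result=timeout; fi
calls=$(cat "$count_file")
echo "result=$result after $calls probes"
rm -f "$count_file"
```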
From this point forward you now have full control of your Docker image down to the OS choice, which of course is still a public OS image. Building your own OS Docker image from scratch is a bit out of scope for this effort.
If you’re curious, my CentOS7-flavored image that does the same thing is available in the tutorial GitHub repo. Feel free to use it or this default Debian one. The big changes are that CentOS uses yum repositories instead of apt-get ones, so a lot of the early installation stuff is changed.
Taking control of your Docker images isn’t that hard. At a minimum, paying attention to your dependencies and where they come from can help you understand what goes into making your containers work. Additionally, if you do so, you can find some opportunities to make your image lighter weight and save some disk space by removing things you don’t need to install. You also lower your risk of a dependency breaking on you.
On the other hand, you take on significant responsibility—you won’t get automatic updates and you’ll have to follow along with changes to things like the Cloudbees Jenkins image. Whether or not this is a benefit to you depends on your personal development policies.
Regardless of whether or not you choose to own your image, I do recommend whenever you pick up a new Dockerfile you follow the same basic practices here. Find the Docker image it inherits from by using Dockerhub to help you follow the inheritance chain. Be aware of all the Dockerfiles in the chain and what they use. You should always be aware of what your images contain—after all this is stuff running on your network, on your servers. At a minimum it helps you find the base operating system but you can also learn a lot about the ecosystem of Docker images out there and learn some interesting practices, like how Cloudbees uses Tini to manage child processes.
As always, you can find everything we did here on GitHub. There’s been a lot of great dialog on these posts and I’d love it if you leave comments, questions, and observations in the comment thread below!
At this point you should have a fully functional Jenkins master server image set and the basics of your own Jenkins environment. The next series of posts will explore connecting slaves to this server (or any Jenkins server); in particular, we’ll look at container-based build slaves. There are several different ways that this can currently be done and I will attempt to cover several of the ones we’ve experimented with at Riot as well as some rather potent lessons learned!
For more information, check out the rest of this series:
Part I: Thinking Inside the Container
Part II: Putting Jenkins in a Docker Container
Part III: Docker & Jenkins: Data That Persists
Part IV: Jenkins, Docker, Proxies, and Compose
Part V: Taking Control of Your Docker Image (this article)
Part VI: Building with Jenkins Inside an Ephemeral Docker Container
Part VII: Tutorial: Building with Jenkins Inside an Ephemeral Docker Container
Part VIII: DockerCon Talk and the Story So Far