Category Archives: Docker

If you are looking for interesting tips and information about Docker, you might be in the right place.

Containers: Resurrection vs. Reincarnation

This post is to share a new analogy I am coining about containers, cloud and orchestration. Everyone uses Pets vs. Cattle, some use Ants vs. Elephants. These animal analogies are fine and describe how YOU behave towards your machines/containers when something bad happens.

But it's also important to look at things from the other side - what actually happens to the app? How does it feel? How does it perceive the world around itself?

Resurrection

In the good old Mode 1 world things were simple. Some app died, so you went and resurrected it back to life. Sure, the PID was different, but it was the same machine, same environment, same processes around it. Feels like home...

Reincarnation

Then the cloud and container world appeared and people realised they don't want to bring dead things back to life (it might have something to do with all the scary zombie movies, I think). And so in container orchestration you just get rid of things that appear to be dead and then bring new ones to life. Your app is reincarnated instead of resurrected.

Resurrection vs. Reincarnation.

Reincarnation is not completely new in the IT world - it was already used in MINIX many years ago :). But I am coining this analogy anew in the containers context. Obviously, it's up to you now to spread the wisdom and make sure people know who the original prophet was!

Forget resurrection, reincarnation is the way to go!

 

Kubernetes Persistent Storage Hell

We recently started working on a rather complex application with my team at Red Hat. We all agreed it would be best to use containers, Kubernetes and Vagrant to make our development (and testing) environment easy to set up (and to be cool, obviously).

Our application consists of multiple components; the ones important for this post are MongoDB and something we can call a worker. The reason for MongoDB is clear - we are working with JSONs and need to store them somewhere. A worker takes data, does some work on them and writes the result to the DB. There are multiple types of workers and they need to share some data. We also need to be able to scale (that's why we use containers!), which also requires shared storage. We want both storages to be a local path (especially for the Vagrant use case).

Sounds easy, right? But it's not. Here is the situation with the config objects:

kubernetes/worker-volume.yaml
kubernetes/worker-claim.yaml
kubernetes/mongo-volume.yaml
kubernetes/mongo-claim.yaml

The way you work with volumes in Kubernetes is that you define a PersistentVolume object stating capacity, access mode and host path (still talking about local storage). Then you define a PersistentVolumeClaim with access mode and capacity. Kubernetes then automagically maps these two together - i.e. it randomly matches a claim to a volume which provides the correct mode and enough capacity.
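To make the matching step concrete, here is a minimal sketch of the two objects for the MongoDB storage (the volume name and capacity are illustrative; the claim name and host path are the ones used later in this post). You could feed it to kubectl create -f:

cat <<EOF | kubectl create -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mongo-volume
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/media/mongo-data"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim-1
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
EOF

Note that nothing in the claim says "bind me to mongo-volume" - the binding is purely based on access mode and capacity, which is exactly the problem described below.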

You might be able to see the problem now, but if not, here it is: if you have 2 volumes and 2 claims (as we have), there is no way to be sure which claim will get which volume. You might not care when you first start your app, because the directories you provided for the volumes will probably be empty. But what if you restart the app? Or the Vagrant box (and thus the app)? You cannot be sure which volume will be assigned to which claim.

This leads to an inconsistent state where the volume holding your MongoDB data can end up bound to the worker's claim and vice versa.

I've found 2 related issues on GitHub - https://github.com/.../issues/14908 and https://github.com/.../pull/17056 - which, once implemented and resolved, should fix this. But is there a workaround?

Hell yeah! And it's pretty simple. Instead of defining a PersistentVolumeClaim object and using the persistentVolumeClaim key in a replication controller, you can use hostPath directly in the RC. This is what the patch looked like:

diff --git a/kubernetes/mongodb-controller.yaml b/kubernetes/mongodb-controller.yaml
index ffdd5f3..9d7bbe2 100644
--- a/kubernetes/mongodb-controller.yaml
+++ b/kubernetes/mongodb-controller.yaml
@@ -23,5 +23,5 @@ spec:
 mountPath: /data/db
 volumes:
 - name: mongo-persistent-storage
- persistentVolumeClaim:
- claimName: myclaim-1
+ hostPath:
+ path: "/media/mongo-data"
diff --git a/kubernetes/worker-controller.yaml b/kubernetes/worker-controller.yaml
index 51181df..f62df47 100644
--- a/kubernetes/worker-controller.yaml
+++ b/kubernetes/worker-controller.yaml
@@ -44,5 +44,6 @@ spec:
 mountPath: /data
 volumes:
 - name: worker-persistent-storage
- persistentVolumeClaim:
- claimName: myclaim-2
+ hostPath:
+ path: "/media/worker-data"

The important bits of the Kubernetes config then look like this:

...
   volumeMounts:
     - name: mongo-persistent-storage
       mountPath: /data/db
 volumes:
   - name: mongo-persistent-storage
     hostPath:
       path: "/media/mongo-data"
...

How to (be a) man on Atomic Host

One major thing missing on Atomic Host is manual pages. Not a terrible thing - you can always google them, right? But what if you cannot? Then there is the Fedora Tools Docker image. Try this:

-bash-4.3$ alias man="sudo atomic run vpavlin/fedora-tools man"
-bash-4.3$ man systemd

You should see the manual page for systemd. Thinking about it, that's it. Nothing more you need to know about it. Simple :)

 

Running git on Atomic Host with Fedora Tools image

I added the Fedora Tools image to the Fedora-Dockerfiles repository, as you might know from my earlier post. I'd like to introduce you to one use case for this image - git.

When I started to work more on Docker images, I started using Atomic Host(s) for testing, as they boot faster and are easier to set up than classic installations. The problem was getting data into those VMs running Atomic Host, as git was not present. That's where I first really appreciated the tools image.

bash-4.3# yum
bash: yum: command not found
bash-4.3# git
bash: git: command not found
bash-4.3# atomic run fedora/tools
[root@localhost /]# cd /host/home/vagrant/
[root@localhost vagrant]# git clone https://github.com/fedora-cloud/Fedora-Dockerfiles
Cloning into 'Fedora-Dockerfiles'...
remote: Counting objects: 2189, done.
remote: Compressing objects: 100% (9/9), done.
remote: Total 2189 (delta 3), reused 0 (delta 0), pack-reused 2180
Receiving objects: 100% (2189/2189), 915.13 KiB | 901.00 KiB/s, done.
Resolving deltas: 100% (1014/1014), done.
Checking connectivity... done.
[root@localhost vagrant]# exit
bash-4.3# ls
Fedora-Dockerfiles sync

It's simple, right? You can see there is neither yum/dnf nor git on the host, but I was still able to clone the repository from GitHub very easily. The important thing to notice is the path I cd'ed to: /host/home/vagrant. You can see the /host prefix there. That's where the host's filesystem is mounted and where I can access and modify it.

You can review the docker run command for the tools image, f.e. with this command:

bash-4.3# docker inspect --format='{{.Config.Labels.RUN}}' vpavlin/fedora-tools
docker run -it --name NAME --privileged --ipc=host --net=host --pid=host -e HOST=/host -e NAME=NAME -e IMAGE=IMAGE -v /run:/run -v /var/log:/var/log -v /etc/localtime:/etc/localtime -v /:/host IMAGE

Obviously, you can do more than just clone the repo - you can run commit, push, checkout or anything else the same way.
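For example, committing and pushing changes works the same way. This is just a sketch; the repository path is the one from the clone above:

atomic run fedora/tools
# inside the tools container, work on the host's checkout through /host
cd /host/home/vagrant/Fedora-Dockerfiles
git checkout -b my-change
git commit -am "Describe the change"
git push origin my-change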

Fedora Tools Docker image

I got a request from my colleagues asking whether there is something like the Red Hat Atomic Enterprise Tools container image available for Fedora or CentOS. The answer was no, there isn't, so I started to work on it. I'd like to tell you what it is and why I invest my time into it.

First of all, the Fedora Tools image is meant to be used mostly on Atomic Host, as there is no way to install missing tools there with yum or dnf. We could create tons of small images, each containing a single tool. But that would a) make it hard for users to find all the tools, b) consume more space than a single image if you decide to use many (all...) of them, and c) be hard to maintain.

These 3 reasons led us to create a single image containing a big number of tools important to sysadmins, performance analysts, or just users who need man pages on Atomic Host. This image is pretty big (more than 1 GB), but it can be pretty useful.

The current version of the Dockerfile can be found in the Fedora-Dockerfiles repository. You can find the list of additional packages (on top of what's already in the base image) starting on line 13.

The basic information on how to use the Fedora Tools Docker image can be found in the README file, and I hope to provide more how-tos here soon :).

I've set up an automated build as vpavlin/fedora-tools under my namespace on Docker Hub. To try the image, you can do:

atomic run vpavlin/fedora-tools

Enjoy;-)

How I Almost Blew InstallFest 2015

You know how it goes: a lot of work, a bit of carelessness and invitations to several events (almost) at once. This cocktail of circumstances left me convinced for quite a long time that InstallFest 2015 was taking place the following weekend (that is, 14 and 15 March). Then on Thursday evening you're scrolling through Twitter and suddenly there's a mention that the badges are already prepared for Saturday. Saturday? Like, this Saturday? Hmm...

Screenshot from 2015-03-07 17:58:04

And indeed it was! Oh well, you go to bed telling yourself: "I'll do the slides tomorrow at work, it won't take long." Except at work someone is constantly bugging you, wanting something, so you get nothing done. Fine, at home then, in the evening. Except you head out for food and beer instead. Well, at least right after that you buy your bus ticket in the morning ;). All right, the slides will get thrown together in the morning before leaving. In the morning you drag yourself out of bed, stare at an empty presentation and ask yourself: "What did I actually want to tell these people when I submitted this talk?" You slap something together and then spend half the trip polishing it and wondering what on earth you had in mind that morning.

So the slides were sorted, what about a demo? Hmm, knowing how these things go, there won't be time for a demo, and if there is, it will turn into something else depending on the questions. Hardly worth preparing ;). And I was right, it wasn't!

Action, here we go...

"Dovolte mi, abych vás přivítal na své přednášce. Na úvod se vás chci zeptat...ale co to plácám. Tak já se asi představím, co?" Jak vidíte, začal jsem zkušeně, tedy chci říct zmateně. Ovšem tu otázku jsem položil: "Kdo jste slyšeli o Dockeru před tím, než jste si přečetli název téhle přednášky?" Skoro všichni, fajn. "Kdo jste si ho nainstaloval?" zněla další otázka - asi 4 ruce. Uff, to zase budu plácat kraviny. Tak a poslední dotaz: "Ok, kdo jste používali kontejnery ještě před Dockerem?" Tři ruce, sakra, tak tyhle lidi ignorovat, když se budou ptát..ti jsou určitě chytří a ví toho víc než já!

As I suspected at the beginning, there was no time for a proper demo. Besides, my only "proper" demo is the one I described in the post Running services with Docker and systemd. So take a look at it and try the demo yourselves ;) Maybe it will break for you too, as it certainly would have for me.

Also, as has become tradition, the talk degenerated into a Q&A session where I got tricky questions and gave essentially unrelated answers. (I'm getting better and better at that!) But I have to say I enjoyed it with you, InstallFest folks. We had a nice chat. And on top of that, nobody asked me about networking, which I greatly appreciate!

Honestly, the slides on their own probably won't tell you much, but here they are - there are some links at the end, so maybe they'll be useful. The talk was apparently being recorded, so once the video is out, I'll add it here. And now good night, I'm going to watch a movie, now that Student Agency has finally refreshed their selection, and most likely take a nap too. Thanks once more for coming!

EDIT:

As I found out, the video was streamed directly to YouTube, so here is the recording of the talk:

My Docker Helpers

I work with Docker almost all the time in my job at Red Hat. Building, running, inspecting containers... Writing the same long commands every time you want to run a container or get its IP starts to annoy you quickly. That's why I started writing small helpers in the form of bash functions which are loaded through .bashrc and thus can be used from the command line easily.

You can find them in my docker-tools repository but let me introduce them a bit.

docker-rmi-none

If you load/import/build images often, you end up with a bunch of <none>-named images in your docker images output. The above command removes them all.
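In case you are curious, a minimal sketch of such a function could look like this (the real helper lives in the docker-tools repository):

# remove all dangling (<none>) images
docker-rmi-none() {
    docker rmi $(docker images --filter "dangling=true" -q)
}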

docker-rm-all

I use this mostly in VMs where I am limited in terms of disk space - every container, especially when you test f.e. whether yum install works, eats some space, and this command lets you remove them all quickly.
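Again, a minimal sketch of the idea (drop -f if you only want to remove stopped containers):

# remove all containers, running or not
docker-rm-all() {
    docker rm -f $(docker ps -aq)
}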

dr fedora
docker run --name tmp0 -it --rm fedora bash
dr fedora cat /etc/os-release 
docker run --name tmp0 -it --rm fedora cat /etc/os-release
NAME=Fedora
...

This dr command is probably my favourite. It runs bash in the given image with the arguments I use most. You can also specify a command to run if you wish.

dl [PATH_TO_]IMAGE

A simple alias for the docker load command, with the advantage of being able to load from a default directory - you can just give it a file name and it looks in the predefined folder.
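A sketch of what dl might look like; the default directory here is a made-up example and the real helper may use a different location:

# load an image from a file, falling back to a predefined directory
dl() {
    local image=$1
    [ -f "$image" ] || image="${IMAGES_DIR:-$HOME/images}/$image"
    docker load -i "$image"
}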

de [CONTAINER] [CMD]

The most awesome thing about using functions instead of just aliases is that you can add whatever logic you like. So my de command (representing docker exec) can be called with a container id/name and a command - same as docker exec. But it can also be called without a command, which then defaults to bash, and also without a container id/name, which defaults to the last entry in the docker ps output. If you want to skip specifying the container but still want to use a different cmd than bash, use the following syntax:

de "" rpm -qa

I don't use the next command as often as those above, but I still like it a lot - it lets you print the IP address of any container. If the container id/name is not specified, it uses the same logic as de.

di [CONTAINER]
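A possible sketch, reusing the same container default as de:

# print a container's IP address
di() {
    docker inspect --format '{{ .NetworkSettings.IPAddress }}' "${1:-$(docker ps -lq)}"
}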

The last command I have on my list at the moment is dk, and you can maybe guess - yes, it's docker kill, and it provides the same defaulting logic as the two above.

dk [CONTAINER]
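And a sketch of dk, with the same defaulting:

# docker kill with the same container default as de and di
dk() {
    docker kill "${1:-$(docker ps -lq)}"
}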

Do you have more aliases/ideas? Let me know, I am happy to make my list richer!

Delete an Image from Private Docker Registry

Have you ever wondered how to remove repositories/tags from your private Docker registry? It's simple according to the Docker registry API specs. So let's try this:

yum -y install docker-registry
sudo systemctl start docker-registry
docker pull fedora:21
docker tag fedora:21 localhost:5000/fedora:21 
curl localhost:5000/v1/repositories/fedora/tags/21

You should see an image id printed to your terminal. Now let's delete the image...

$ curl -X DELETE localhost:5000/v1/repositories/fedora/tags/21
true

To be clear - it does not remove the image/layer data - it just removes the reference from the fedora:21 tag to the image id (i.e. the data). If there is any other tag referencing the data, it will still be accessible.
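If you want to double check, listing the remaining tags for the repository should no longer include 21 (assuming the same v1 API endpoints as above):

curl localhost:5000/v1/repositories/fedora/tags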

Anyway, in some cases it is useful to be able to remove this reference. I run a private registry with 337 images (multiplied by a few tags for every image) and I sometimes found myself in a situation where I had pushed an image with the wrong tag or just wanted to stop people from pulling a specific image. I wrote a small bash script for these occasions - drrm.sh. The usage is simple:

drrm.sh NAME[:TAG]

For our fedora example this means:

$ ./drrm.sh localhost:5000/fedora:21
Do you really want to untag "834629358fe214f210b0ed606fba2c17827d7a46dd74bd3309afc2a103ad0e89"? [y/N]: y
Image library/fedora:21 removed from localhost:5000

First it checks whether the image exists, then it asks for confirmation of the removal, and then it calls the previously shown curl command to delete the reference. I also have a simple "Docker Registry Garbage Collector" under development which goes through the docker-registry directory and moves unreferenced layers away (so you can delete them later). But that's going to be a topic for next time :).
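The script itself is in the repository linked above, but a minimal sketch of the same flow might look like this (the argument parsing and the v1 endpoints are assumptions based on the examples shown earlier):

#!/bin/bash
# untag REGISTRY/NAME[:TAG] on a v1 docker-registry; the layer data stays in the registry
set -e

REGISTRY=${1%%/*}          # e.g. localhost:5000
REPO_TAG=${1#*/}           # e.g. fedora:21
NAME=${REPO_TAG%%:*}
TAG=${REPO_TAG##*:}
[ "$TAG" = "$NAME" ] && TAG=latest

# check that the tag exists and grab the image id it points to
IMAGE_ID=$(curl -sf "http://$REGISTRY/v1/repositories/$NAME/tags/$TAG") || {
    echo "Image $NAME:$TAG not found on $REGISTRY"; exit 1; }

read -p "Do you really want to untag $IMAGE_ID? [y/N]: " ANSWER
[ "$ANSWER" = "y" ] || exit 0

# remove only the tag -> image id reference
curl -sf -X DELETE "http://$REGISTRY/v1/repositories/$NAME/tags/$TAG"
echo "Image $NAME:$TAG removed from $REGISTRY"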

Running services with docker and systemd

I described how to run systemd in a Docker container on Fedora in a previous article, but didn't give you any "real" example of how to actually use it. I guess you find it easy to figure out your own examples, but let me show you mine.

WordPress & MariaDB

I like to use WP and MariaDB as example applications when I give a talk about containers. It's quite a simple setup and at the same time it uses some nice features of docker (f.e. links or port mapping). If you just want to see my Dockerfiles, please go to this github repository. If you want more babbling, read on.

The MariaDB Dockerfile is forked from Fedora-Dockerfiles and I guess it will be "merged" there soon :). The first few lines are quite boring - yum update/install. The next line I really like. How to start MariaDB properly... hmm... hey, let's use what the packagers came up with - the mariadb.service file.

RUN systemctl enable mariadb.service

This really presents the beauty of using systemd inside containers. A single line tells the init system that we want to start the service when the container starts. Easy, clean, awesome. Then some "stuff" follows. I leave it up to you to figure out what it actually does - everybody likes homework, right?

You can also check my WordPress Dockerfile in the same repository. It's a bit longer, but the most important line for this case is again the enablement of the service - in this case httpd.

RUN systemctl enable httpd.service

If you build those two images...

docker build -t vpavlin/mariadb mariadb/
docker build -t vpavlin/wordpress wordpress/

you can run them with these commands:

docker run -it --rm -v /sys/fs/cgroup:/sys/fs/cgroup --name mariadb vpavlin/mariadb
docker run -it --rm -v /sys/fs/cgroup:/sys/fs/cgroup -p 80:80 --name wordpress --link mariadb:mariadb vpavlin/wordpress

To describe these commands I'd say: run a container with stdin and stdout attached to my tty, volume mount /sys/fs/cgroup for systemd, name the containers mariadb and wordpress (respectively) and link mariadb to wordpress (which basically means telling WP how to connect to mariadb). Oh, and map port 80 of the WP container to the host's port 80.

When you hit http://localhost in your browser, you should see a WP installation page.

Screenshot from 2015-02-24 23:35:53

Any questions?:)

Fedora, docker and systemd

You've heard about systemd, right? The init system written (not only) by the famous Lennart Poettering which is trying to eat all those nice ancient tools and system parts (just kidding 😉) but which also makes the lives of many system administrators waaaaay easier. You've probably also heard about Docker - the tool which currently leads the world of Linux containers (and will probably soon do the same in the Windows world). So, where do these two projects meet?

Scott Collier asked me to give him a status of where we are with "running systemd in Docker containers" in Fedora and to let him know how to actually make it work. If you have searched Google before, you've probably found Dan Walsh's article about running systemd inside a container. Some things have changed, some (hopefully) will change sooner or later, but I can tell you that it's now quite easy to run services in a Docker container with systemd.

First things first. To be able to run systemd we need a few things - a cgroup tree, /run and /tmp to be mountpoints (preferably on tmpfs), the environment variable container set to "docker", getting rid of fstab and mount units, and a small tweak of dbus.service - and that's it. Some of them are on you; we took care of the rest in the Fedora base images.

Cgroups

Well, it's simple - systemd touches /sys/fs/cgroup and expects it to be populated. As the kernel won't populate cgroups in the container, we need to mount them from the host. Easy, right? Sadly this cannot be done automatically, as Docker tries to stay above these distribution specific modifications (which is good... mostly). So you need to add

-v /sys/fs/cgroup:/sys/fs/cgroup:ro

to your docker create/run command line.

Temp mounts

We can blame both systemd and Docker for not being able to solve this for us automatically. We need either systemd to stop requiring /run and /tmp to be mount points, or Docker to provide volumes for them by default. I think I understand both points of view. It's again a distribution specific change for Docker, and at the same time it's a sane default for systemd to require /tmp and /run to be really temporary. So how do we get around this? Let's add another volume to our image (there is a PR for Docker to do it automatically). Contrary to the cgroup mount, this does not have to map to any specific location on the host. So the command line solution would be

-v /run -v /tmp

or in Dockerfile

VOLUME ["/run", "/tmp"]

Environment variable(s)

There are ways for systemd to figure out where and how it runs. It checks a bunch of things, and one of them is the environment variable $container. It can equal a few values (f.e. lxc), but here we, for obvious reasons, want its value set to docker. So on the command line you would need

-e container=docker

or in Dockerfile

ENV container docker

There is another variable systemd can use. It's called $container_uuid and it is used to set /etc/machine-id. That can be very useful because it, for example, identifies your container in journald. Wouldn't it be awesome if we could get this set up automatically by the Docker daemon when the container is created? There is a (closed) PR on Docker for this.

Mounts

Docker containers drop the sysadmin capability, which is good for security but bad for systemd. It tries to do some mounting on start up and, expectedly, it fails. The easiest way of getting rid of these failures is 1) to remove /etc/fstab and 2) to mask the mount units which systemd ships (I've found these in the Fedora base image: dev-hugepages.mount, sys-fs-fuse-connections.mount). Both are done in fedora-base-docker.ks in the %post section (which is used to build the base image).
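If you are building another layer on top of a base image instead of rebuilding the base image itself, the same two steps could be done in a Dockerfile roughly like this - the exact list of units to mask may differ between images:

RUN rm -f /etc/fstab
RUN systemctl mask dev-hugepages.mount sys-fs-fuse-connections.mount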

Dbus

This again has something to do with capabilities. The dbus service tries to change its OOMScore in the unit file, which fails. But this time it fails quite badly - sometimes the container dies completely, sometimes systemd says it's logging too fast and freezes, but in all cases the container is basically useless. It should be fixed in the latest systemd builds in Fedora, but I still hit this in the fedora:21 image. To solve this for your containers, please add this line to your Dockerfile:

RUN cp /usr/lib/systemd/system/dbus.service /etc/systemd/system/; sed -i 's/OOMScoreAdjust=-900//' /etc/systemd/system/dbus.service

Summary

OK, now I've hopefully convinced you that running your services in containers with systemd is easy. Sadly, what you need at the moment is to create another layer over the Fedora base image. You can do that with this Dockerfile:

FROM fedora
MAINTAINER Vaclav Pavlin <vpavlin@redhat.com>

RUN yum -y update; yum clean all

RUN systemctl mask systemd-remount-fs.service dev-hugepages.mount sys-fs-fuse-connections.mount systemd-logind.service getty.target console-getty.service
RUN cp /usr/lib/systemd/system/dbus.service /etc/systemd/system/; sed -i 's/OOMScoreAdjust=-900//' /etc/systemd/system/dbus.service

VOLUME ["/sys/fs/cgroup", "/run", "/tmp"]
ENV container=docker

CMD ["/usr/sbin/init"]

 

Build it, for example, like this:

docker build -t fedora:systemd .

Or use an image I've prepared for you on Docker Hub: vpavlin/fedora:systemd

The following command will do the work:

docker run -it --rm -v /sys/fs/cgroup:/sys/fs/cgroup:ro fedora:systemd

Screenshot from 2015-02-24 12:07:15

Some lines in the Dockerfile above will become redundant when F22 is released, so I'll probably update the article when we get there.

By the way, you probably want to continue with the next post: Running services with docker and systemd.