You got an assumption wrong about how volumes work in docker. I'll try to explain how volumes relate to docker containers and docker images, and hopefully the differences between data volumes and data volume containers will become clear.
First, let's recall a few definitions.
Docker images
Docker images are essentially a union filesystem + metadata. You can inspect the content of an image's union filesystem by creating a container from it and running the docker export
command on that container, and you can inspect an image's metadata with the docker inspect
command.
Data volumes
From the Docker user guide:
A data volume is a specially-designated directory within one or more containers that bypasses the Union File System to provide several useful features for persistent or shared data.
It is important to note here that a given volume (the directory or file that contains the data) is reusable only if at least one docker container is using it. Docker images don't have volumes; they only have metadata which tells where volumes should be mounted on the union filesystem. Data volumes aren't part of a container's union filesystem either, so where are they? Under /var/lib/docker/volumes
on the docker host (while containers are stored under /var/lib/docker/containers
).
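As a rough sketch, the host path behind a container's volumes can be read from the container metadata. The helper name volume_mounts and the DOCKER override variable are illustrative assumptions, not part of Docker, and the .Mounts field is only present in the docker inspect output of reasonably recent Docker versions.

```shell
# Hypothetical helper: print the mounts of a container as JSON.
# Each entry lists the host-side "Source" and the in-container "Destination".
# DOCKER can be overridden (for example DOCKER=echo) to do a dry run.
volume_mounts() {
    "${DOCKER:-docker}" inspect --format '{{ json .Mounts }}' "$1"
}

# Usage: volume_mounts <container-name-or-id>
```

The "Source" value of each mount should point somewhere under /var/lib/docker/volumes for a data volume.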
Data volume containers
There is nothing special about this type of container. It is just a stopped container using a data volume, whose sole purpose is to ensure that at least one container uses that data volume. Remember, as soon as the last container (running or stopped) using a given data volume is deleted, that volume becomes unreachable through the docker run --volumes-from
option.
Working with data volume containers
How to create a data volume container
The image used to create a data volume container does not matter, since such a container can remain stopped and still fulfill its purpose. So to create a data container named datatest_data
for a volume in /datafolder
you only need to run:
docker run --name datatest_data --volume /datafolder busybox true
Here busybox
is the image name (a conveniently small one) and true
is a command we provide just to avoid seeing the docker daemon complain about a missing command. After that, you have a stopped container named datatest_data
whose sole purpose is to let you reach that volume with the --volumes-from
option of the docker run
command.
How to read from a data volume container
I know two ways of reading a data volume. The first one is through a container: if you cannot get a shell in an existing container to access that data volume, you can run a new container with the --volumes-from
option for the sole purpose of reading that data.
For instance:
docker run --rm --volumes-from datatest_data busybox cat /datafolder/data.txt
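To make the round trip concrete, here is a small sketch that writes a file into the volume through a throwaway container and reads it back the same way. The helper names volume_write and volume_read and the DOCKER override variable are assumptions made for illustration; only docker run, --rm, and --volumes-from come from the text above.

```shell
# Hypothetical helpers. DOCKER can be set to "echo" for a dry run.

# Write one line of text to a file inside the data volume.
# usage: volume_write <data-container> <path-in-volume> <text>
volume_write() {
    "${DOCKER:-docker}" run --rm --volumes-from "$1" busybox \
        sh -c 'echo "$2" > "$1"' sh "$2" "$3"
}

# Print a file stored in the data volume.
# usage: volume_read <data-container> <path-in-volume>
volume_read() {
    "${DOCKER:-docker}" run --rm --volumes-from "$1" busybox cat "$2"
}

# Example:
#   volume_write datatest_data /datafolder/data.txt hello
#   volume_read  datatest_data /datafolder/data.txt
```

Both containers are removed as soon as they exit (--rm), but the data stays in the volume because datatest_data still references it.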
The other way is to copy the volume from the /var/lib/docker/volumes
folder. You can discover the name of the volume in that folder by inspecting the metadata of one of the containers using the volume. See this answer for details.
Working with volumes (since Docker 1.9.0)
How to create a volume (since Docker 1.9.0)
Docker 1.9.0 introduced a new command, docker volume
which lets you create volumes:
docker volume create --name hello
How to read from a volume (since Docker 1.9.0)
Let's say you created a volume named hello with docker volume create --name hello; you can then mount it in a container with the -v option:
docker run -v hello:/data busybox ls /data
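A named volume can also be located and cleaned up without going through a container. The sketch below wraps docker volume inspect and docker volume rm; the helper names and the DOCKER override variable are assumptions for illustration only.

```shell
# Hypothetical helpers. DOCKER can be set to "echo" for a dry run.

# Print the directory on the docker host that backs a named volume.
volume_host_path() {
    "${DOCKER:-docker}" volume inspect --format '{{ .Mountpoint }}' "$1"
}

# Delete a named volume (docker refuses while a container still uses it).
volume_remove() {
    "${DOCKER:-docker}" volume rm "$1"
}

# Example:
#   volume_host_path hello
#   volume_remove hello
```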
About committing & pushing containers
It should now be clear that since data volumes aren't part of a container (the union filesystem), committing a container to produce a new docker image won't persist any data that would be in a data volume.
Making backups of data volumes
The docker user guide has a nice article about making backups of data volumes.
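The approach described there boils down to mounting the volume and a host directory into a temporary container and running tar. A minimal sketch, assuming the datatest_data container from earlier; the helper name volume_backup, the /backup mount point, and the archive name backup.tar are illustrative choices:

```shell
# Hypothetical helper. DOCKER can be set to "echo" for a dry run.
# Archive a volume path from a data volume container into <host-dir>/backup.tar.
# usage: volume_backup <data-container> <path-in-volume> <absolute-host-dir>
volume_backup() {
    "${DOCKER:-docker}" run --rm --volumes-from "$1" -v "$3":/backup busybox \
        tar cvf /backup/backup.tar "$2"
}

# Example: volume_backup datatest_data /datafolder "$(pwd)"
```

The host directory has to be an absolute path, otherwise docker treats the first part of -v as a volume name rather than a bind mount.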
Good article regarding volumes: http://container42.com/2014/11/03/docker-indepth-volumes/
The ideal target scenario
Yes, you should use a load balancer and update one instance at a time. I'm not sure where inter-container communication comes in.
As an example, imagine you have a load balancer which serves your site A. Users only connect to it and only know it as "A". The load balancer knows that there are two or more backends (B, C, etc.), and whether they're VMs or containers doesn't matter.
Then, you want to upgrade the backends, which in this case are Apache instances.
- take B out of the eligible backends for the load balancer so it's no longer accepting any traffic.
- wait for the currently-live requests to be served and existing connections closed.
- update the container or underlying VM that serves B
- restart B, wait for it to load and start working
- test B to make sure it's serving new requests properly
- add B back to the load balancer backend pool to re-enable traffic
Then, do the same process for C, D, etc.
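The procedure above can be sketched as a loop over the backends. Every function below (lb_remove, wait_for_drain, update_backend, restart_backend, health_check, lb_add) is a hypothetical placeholder that only prints what it would do; the real commands depend entirely on your load balancer and deployment tooling.

```shell
# Placeholder hooks with hypothetical names: replace each body with the
# real command for your load balancer and deployment tooling.
lb_remove()       { echo "remove $1 from pool"; }
wait_for_drain()  { echo "drain $1"; }
update_backend()  { echo "update $1"; }
restart_backend() { echo "restart $1"; }
health_check()    { echo "test $1"; }
lb_add()          { echo "add $1 to pool"; }

# Upgrade the given backends one at a time. If any step fails, stop and
# leave the current backend out of the pool so it receives no traffic.
rolling_update() {
    for backend in "$@"; do
        lb_remove "$backend"       || return 1
        wait_for_drain "$backend"  || return 1
        update_backend "$backend"  || return 1
        restart_backend "$backend" || return 1
        if ! health_check "$backend"; then
            echo "health check failed for $backend, leaving it out of the pool" >&2
            return 1
        fi
        lb_add "$backend"
    done
}

# Example: rolling_update B C
```

Stopping at the first failed health check means the remaining backends keep serving the old version, so a bad release never takes the whole site down.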
Note that there's an open request for in-place upgrades of Docker containers, from Nov 2013, but it doesn't appear to have made much progress, so the above solution is what you should do in the meantime.
What to do for an existing live site
Presumably, you're asking this because you're already running a live site in this model and you would like to upgrade it without downtime. So, we need to get to the ideal target state above, but incrementally.
Let's assume that:
- you have a DNS name pointing to your container
- your container runs on some IP address
- your users don't know the container's IP address and it's not hard-coded anywhere
If any of these assumptions is false, fix that first so that they all hold.
Then, follow these steps:
- create a load balancer at a new IP and point it at the existing container as its only backend
- change DNS to point to the load balancer rather than the container IP directly
- add an identical Apache backend with the same VM + container setup
- now you have a load balancer with two backends B and C, so follow the directions in the "ideal target scenario" section for upgrading them one at a time
How to update a load balancer
The easy (hosted) way
The easiest option is to not run your own balancer. For example, if you're using a cloud platform which provides load balancing as a service, consider using it; maintaining and updating the load balancer is then no longer your problem.
The manual way
If you are running your own load balancer, adding another layer of indirection (i.e., DNS) will help. Let's assume the following:
- that we have a host name resolving to the IP of our load balancer A which we would like to update
- our load balancer has a backend pool of P1, P2, etc.
We proceed as follows:
and you're done.
Details, diagrams, and tooling
See these write-ups and tools that can help you automate the process, but the general idea is the same:
The Moral
"All problems in computer science can be solved by another level of indirection, except of course for the problem of too many indirections." — David Wheeler
Best Answer
You should put a restart policy and stop_grace_period in your compose file:
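A minimal sketch of what such a compose file might contain, written out from the shell so it can be tried quickly. The service name web, the image httpd:2.4, the restart value always, and the 30s grace period are all placeholder assumptions; only the restart and stop_grace_period keys come from the advice above.

```shell
# Hypothetical helper: write a sample compose file to the given path.
# Service name, image, and values are placeholders to adapt.
write_compose_sketch() {
    cat > "$1" <<'EOF'
# Older docker-compose releases expect a version key; newer ones ignore it.
version: "3"
services:
  web:
    image: httpd:2.4
    # Restart the container automatically if it stops.
    restart: always
    # Time to wait after the stop signal before the container is killed.
    stop_grace_period: 30s
EOF
}

# Example: write_compose_sketch docker-compose.yml
```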