The docker events command may help, and the docker logs command can fetch logs even after the container has failed to start.
First, start docker events in the background to see what's going on:
docker events&
Then run your failing docker run ... command.
Then you should see something like the following on screen:
2015-12-22T15:13:05.503402713+02:00 xxxxxxxacd8ca86df9eac5fd5466884c0b42a06293ccff0b5101b5987f5da07d: (from xxx/xxx:latest) die
You can get the container's hex ID from that message or from the output of the run command, and then use it with the logs command:
docker logs <copy the instance id from docker events messages on screen>
You should now see some output from the failed image startup.
As @alexkb suggested in a comment, docker events& can be troublesome if your container is constantly being restarted by something like an AWS ECS service. In that scenario it may be easier to get the container's hex ID out of the logs in /var/log/ecs/ecs-agent.log.<DATE>, and then use docker logs <hex id>.
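If the container has already exited, you can also find its ID without docker events; a minimal sketch, assuming a reasonably recent Docker CLI:
# list all containers, including ones that have already exited
docker ps -a
# or feed the most recently created container's ID straight into docker logs
docker logs "$(docker ps -lq)"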
The ideal target scenario
Yes, you should use a load balancer and update one instance at a time. I'm not sure where inter-container communication comes in.
As an example, imagine you have a load balancer which serves your site A. Users connect only to the load balancer and know the site only as "A". The load balancer knows that there are two or more backends (B, C, etc.); whether they're VMs or containers doesn't matter.
Then, you want to upgrade the backends, which in this case are Apache instances.
- take B out of the eligible backends for the load balancer so it's no longer accepting any traffic.
- wait for the currently-live requests to be served and for existing connections to close.
- update the container or underlying VM that serves B
- restart B, wait for it to load and start working
- test B to make sure it's serving new requests properly
- add B back to the load balancer backend pool to re-enable traffic
Then, do the same process for C, D, etc.
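How you take a backend out of rotation and put it back depends on your load balancer. As one illustration only, here is a minimal sketch of the drain and re-enable steps, assuming you run your own HAProxy with its admin socket enabled at /var/run/haproxy.sock (the backend name bk_web and server name B are placeholders):
# stop sending new traffic to B while in-flight requests finish
echo "set server bk_web/B state drain" | socat stdio /var/run/haproxy.sock
# ...upgrade and restart B, test it, then put it back into rotation...
echo "set server bk_web/B state ready" | socat stdio /var/run/haproxy.sock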
Note that there's an open feature request for in-place upgrades of Docker containers, dating from Nov 2013, but it doesn't appear to have made much progress, so the above approach is what you should do in the meantime.
What to do for an existing live site
Presumably, you're asking this because you're already running a live site in this model and you would like to upgrade it without downtime. So, we need to get to the ideal target state above, but incrementally.
Let's assume that:
- you have a DNS name pointing to your container
- your container runs on some IP address
- your users don't know the container's IP address and it's not hard-coded anywhere
If any of these assumptions are false, you should fix that first so that they hold.
Then, follow these steps:
- create a load balancer at a new IP and point it at the existing container as its only backend
- change DNS to point to the load balancer rather than the container IP directly
- add an identical Apache backend with the same VM + container setup
- now you have a load balancer with two backends, B and C, so follow the directions in the "ideal target scenario" section above to upgrade them one at a time
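As a rough sketch of verifying the DNS change and adding the second backend, assuming plain docker and dig on the host (the image name, container name, port mapping, and domain are all placeholders):
# confirm the DNS name now resolves to the load balancer's IP, not the old container's
dig +short A www.example.com
# bring up a second, identical Apache backend next to the existing one
docker run -d --name apache-c -p 8081:80 my-apache-image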
How to update a load balancer
The easy (hosted) way
The easiest option is to not run your own load balancer. For example, if you're using a cloud platform which provides load balancing as a service, consider using it; then maintenance and updates of the load balancer are not your problem.
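With a hosted balancer, the drain/re-enable steps from the ideal scenario above go through the provider's API instead. For instance, on AWS with a classic ELB it could look roughly like this (the load balancer name and instance ID are placeholders):
# take the backend instance out of rotation before upgrading it
aws elb deregister-instances-from-load-balancer --load-balancer-name my-lb --instances i-0123456789abcdef0
# ...upgrade and test the instance, then put it back...
aws elb register-instances-with-load-balancer --load-balancer-name my-lb --instances i-0123456789abcdef0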
The manual way
If you are running your own load balancer, adding another layer of indirection (i.e., DNS) will help. Let's assume the following:
- we have a host name resolving to the IP of our load balancer A, which we would like to update
- our load balancer has a backend pool of P1, P2, etc.
We proceed as follows:
- bring up a new load balancer B on a new IP, configured with the same backend pool P1, P2, etc.
- test B to make sure it routes traffic to the backends correctly
- change the DNS record for the host name to point at B instead of A
- wait for the DNS TTL to expire and for A's existing connections to finish, so no clients reach A any more
- update or replace A (or simply retire it and keep B as the live balancer)
and you're done.
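A small sketch of the waiting step, assuming dig and watch are available and www.example.com stands in for your host name:
# check which IP the host name currently resolves to
dig +short A www.example.com
# re-check every 30 seconds until resolvers consistently return the new balancer's IP
watch -n 30 dig +short A www.example.com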
Details, diagrams, and tooling
There are various write-ups and tools that can help you automate the process, but the general idea is the same.
The Moral
"All problems in computer science can be solved by another level of indirection, except of course for the problem of too many indirections." — David Wheeler
Best Answer
Not a solution but a workaround:
I have the same problem on CoreOS 607.0.0 and reproduced it in containers based on Ubuntu or Fedora. However, containers that use busybox do not have this issue. Two workarounds:
1) use a busybox-based container image such as alpine
2) install busybox in your existing container and run