I'm going to dispel a few myths here.
This is just a bad idea. I'm sorry. – Jacob Mar 5 at 20:30
I don't see how this is a bad idea. It's really just a chroot inside a chroot. On one hand, it could decrease performance in some negligible manner (nothing compared to running a VM inside a VM). On the other hand, it's likely to be more secure (e.g. more isolated from the root host system and its constituents).
Do you actually have a real reason to do this? Please remember that questions here should be about actual problems that you face. – Zoredache Mar 5 at 21:52
I agree 100% with the poster's following comment. Furthermore, I think it's safe to assume that everybody who posts a question on here likely thinks that they have a real reason to do it.
I think that LXC should be able to simplify VM migration (and backup+recovery too). But I'm not sure about cases where there is no access to the host OS (a cheap VPS, for example). – Mikhail Mar 6 at 11:17
I actually came across this question back in June when I was first diving into LXC for PaaS/IaaS projects, and I was particularly interested in the ability to allow users to emulate cloud environments for development purposes.
LXCeption. We're too deep. – Tom O'Connor Mar 6 at 22:46
I laughed a little when I read this one, but that's not at all the case :)
Anyway, I eventually set up a VirtualBox environment with a stock install of Ubuntu 12.04 LTS Server Edition after reading all this, thinking that this was 100% possible. After installing LXC, I created a new container and installed LXC inside the container with apt-get. Most of the installation went well, but it eventually failed with an error due to a problem with the cgroup-lite package, whose upstart job failed to start after the package had been installed.
After a bit of searching, I came across this fine article at stgraber.org (the goodies are hiding under the "Container Nesting" section):
sudo apt-get install lxc
sudo lxc-create -t ubuntu -n my-host-container
sudo wget https://www.stgraber.org/download/lxc-with-nesting -O /etc/apparmor.d/lxc/lxc-with-nesting
sudo /etc/init.d/apparmor reload
sudo sed -i "s/#lxc.aa_profile = unconfined/lxc.aa_profile = lxc-container-with-nesting/" /var/lib/lxc/my-host-container/config
sudo lxc-start -n my-host-container
(in my-host-container) sudo apt-get install lxc
(in my-host-container) sudo stop lxc
(in my-host-container) sudo sed -i "s/10.0.3/10.0.4/g" /etc/default/lxc
(in my-host-container) sudo start lxc
(in my-host-container) sudo lxc-create -n my-sub-container -t ubuntu
(in my-host-container) sudo lxc-start -n my-sub-container
Installing that AppArmor policy and restarting the daemon did the trick (don't forget to change the network ranges, though!). In fact, I thought that particular snippet was so important that I mirrored it @ http://pastebin.com/JDFp6cTB just in case the article ever goes offline.
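The network-range change is just rewriting the LXC_NETWORK setting in /etc/default/lxc inside the container, so the nested bridge (10.0.4.x) doesn't collide with the outer host's 10.0.3.x bridge. A minimal sketch of what that sed does, run against a throwaway scratch file rather than the real config:

```shell
# Demonstrate the substitution on a scratch copy
# (the real command edits /etc/default/lxc inside my-host-container).
tmp=$(mktemp)
echo 'LXC_NETWORK="10.0.3.0/24"' > "$tmp"
sed -i "s/10.0.3/10.0.4/g" "$tmp"
cat "$tmp"   # LXC_NETWORK="10.0.4.0/24"
rm -f "$tmp"
```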
After that, sudo /etc/init.d/cgroup-lite start succeeded and it was smooth sailing.
So, yes, it is possible to start an LXC container inside of another LXC container :)
A better way to make your change permanent is to use sysctl instead of writing to /proc directly, since that is the standard way to configure kernel parameters at runtime and ensures they are set correctly at the next boot:
# cat >> /etc/sysctl.d/99-bridge-nf-dont-pass.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
net.bridge.bridge-nf-filter-vlan-tagged = 0
EOF
# service procps start
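To confirm the settings actually took effect, you can read them back with sysctl. One caveat: the net.bridge.* keys only exist while the bridge module is loaded, so a small guard avoids a confusing error (this is a sketch, not specific to any particular setup):

```shell
# Read back the bridge-nf settings, if the bridge module is loaded;
# without it, /proc/sys/net/bridge does not exist and sysctl would error out.
if [ -d /proc/sys/net/bridge ]; then
    sysctl net.bridge.bridge-nf-call-iptables \
           net.bridge.bridge-nf-call-ip6tables \
           net.bridge.bridge-nf-call-arptables
else
    echo "bridge module not loaded; net.bridge.* keys are absent"
fi
```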
As for the answer to the question in your update...
bridge-netfilter (or bridge-nf) is a very simple bridge for IPv4/IPv6/ARP packets (even inside 802.1Q VLAN or PPPoE headers) that provides the functionality for a stateful transparent firewall. More advanced functionality, like transparent IP NAT, is provided by passing those packets to arptables/iptables for further processing. However, even if the more advanced features of arptables/iptables are not needed, passing packets to those programs is still turned on by default in the kernel module and must be turned off explicitly using sysctl.
What are they here for? These kernel configuration options exist to either pass (1) or not pass (0) packets to arptables/iptables, as described in the bridge-nf FAQ:
As of kernel version 2.6.1, there are three sysctl entries for bridge-nf behavioral control (they can be found under /proc/sys/net/bridge/):
bridge-nf-call-arptables - pass (1) or don't pass (0) bridged ARP traffic to arptables' FORWARD chain.
bridge-nf-call-iptables - pass (1) or don't pass (0) bridged IPv4 traffic to iptables' chains.
bridge-nf-call-ip6tables - pass (1) or don't pass (0) bridged IPv6 traffic to ip6tables' chains.
bridge-nf-filter-vlan-tagged - pass (1) or don't pass (0) bridged vlan-tagged ARP/IP traffic to arptables/iptables.
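For a one-off, non-persistent change (the counterpart of the sysctl.d file above), you can also write to those /proc entries directly as root. A defensive sketch, since the entries only exist while the bridge module is loaded:

```shell
# Flip all four bridge-nf knobs off at runtime (as root); not persistent.
# Guarded because the files only exist while the bridge module is loaded.
for knob in call-arptables call-iptables call-ip6tables filter-vlan-tagged; do
    f="/proc/sys/net/bridge/bridge-nf-$knob"
    if [ -w "$f" ]; then
        echo 0 > "$f"
    else
        echo "skipping $f (absent or not writable)"
    fi
done
```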
Is it safe to disable all bridge-nf-*? Yes. Not only is it safe to do so, but distributions have been advised to turn it off by default, both to spare people the kind of confusion you encountered:
In practice, this can lead to serious confusion where someone creates
a bridge and finds that some traffic isn't being forwarded across the
bridge. Because it's so unexpected that IP firewall rules apply to
frames on a bridge, it can take quite some time to figure out what's
going on.
and to increase security:
I still think the risk with bridging is higher, especially in the
presence of virtualisation. Consider the scenario where you have two
VMs on the one host, each with a dedicated bridge with the intention
that neither should know anything about the other's traffic.
With conntrack running as part of bridging, the traffic can now
cross over which is a serious security hole.
UPDATE: May 2015
If you are running a kernel older than 3.18, then you may be subject to the old behavior of bridge filtering being enabled by default; if you are on a kernel newer than 3.18, you can still be bitten by this if you have loaded the bridge module and haven't disabled the bridge filtering. See:
https://bugzilla.redhat.com/show_bug.cgi?id=634736#c44
After all these years of asking for the default of bridge filtering to
be "disabled" and the change being refused by the kernel maintainers,
now the filtering has been moved into a separate module that isn't
loaded (by default) when the bridge module is loaded, effectively
making the default "disabled". Yay!
I think this is in the kernel as of 3.17 (It definitely is in kernel
3.18.7-200.fc21, and appears to be in git prior to the tag "v3.17-rc4")
Best Answer
Try getting debug output from lxc, e.g. lxc-start -n <container-name> -l DEBUG -o debug.log; there might be more information in the log.
Also check your lxc version; there has been huge progress in development lately. If you're running something like
lxc 1.0.0.alpha1
or an earlier version, you should consider upgrading.