Security – Can a virtual machine (VM) “hack” another VM running on the same physical machine?

Questions:

  • If a VM is compromised (hacked), what do I risk on other VMs running on the same physical machine?
  • What kinds of security issues are there between VMs running on the same physical host?
  • Is there (can you make) a list of those (potential) weaknesses and/or issues?

Warning:

I know many virtualization types/solutions exist, and may have different weaknesses. However, I'm mostly looking for general security issues about the virtualization techniques, rather than a particular vendor bug.

Please provide real facts, (serious) studies, experienced issues or technical explanations. Be specific. Do not (only) give your opinion.

  • Examples:

Two years ago, I heard that there could be security issues related to the MMU (accessing other machines' main memory, I think), but I don't know whether that is a practical threat today or just a theoretical research subject.

EDIT: I also found this "Flush+Reload" attack capable of retrieving GnuPG secret keys on the same physical machine, by exploiting the L3 CPU cache, even if GnuPG runs on another VM. GnuPG has been patched since.

Best Answer

Of course it is possible to exploit another VM running on the same hardware, given a working exploit, and such exploits can exist: your question cites recent work demonstrating one. I'm not going to share any specific exploits or PoCs here, but I'll gladly describe how they can be constructed.

The exploits used in this context are naturally different from those that work when you are already running on the machine whose service you are trying to exploit, and they tend to be considerably harder because of the increased isolation. However, some general approaches that can be used to accomplish such an exploit include:

  • Attack the hypervisor. If you can get a sufficiently privileged shell on the hypervisor from within a VM, you can gain control over any VM on the system. The way to approach this is to look for data flows that lead from the VM into the hypervisor and are highly hypervisor-dependent; things like paravirtualized drivers, clipboard sharing, display output, and network traffic tend to create this type of channel. For instance, a malicious call to a paravirtualized network device might lead to arbitrary code execution in the hypervisor context responsible for passing that traffic to the physical NIC driver.
  • Attack the hardware on the host. Many devices allow for firmware updates, and if it happens to be possible to access the mechanism for that from a VM, you could upload new firmware that favours your intentions. For instance, if you are permitted to update the firmware on the NIC, you could cause it to duplicate traffic bound for one MAC address (the victim's), but with another destination MAC address (yours). For this reason many hypervisors filter such commands where possible; ESXi filters CPU microcode updates when they originate from a VM.
  • Attack the host's architecture. The attack you cited, essentially yet another timing-based key disclosure attack, does this: it exploits the caching mechanism's impact on operation timing to discern the data being used by the victim VM in its operations. At the core of virtualization is the sharing of components; where a component is shared, the possibility of a side channel exists. To the extent that another VM on the same host is able to influence the behaviour of the hardware while it is running in the victim VM's context, the victim VM is controlled by the attacker. The referenced attack makes use of the VM's ability to control the behaviour of the CPU cache (essentially shared global state) so that the victim's memory access times more accurately reveal the data it is accessing; wherever shared global state exists, the possibility of a disclosure exists also. To step into the hypothetical to give examples, imagine an attack which massages ESXi's VMFS and makes parts of virtual volumes reference the same physical disk addresses, or an attack which makes a memory ballooning system believe some memory can be shared when in fact it should be private (this is very similar to how use-after-free or double-allocation exploits work). Consider a hypothetical CPU MSR (model-specific register) which the hypervisor ignores but allows access to; this could be used to pass data between VMs, breaking the isolation the hypervisor is supposed to provide. Consider also the possibility that deduplication is used so that duplicate components of virtual disks are stored only once - a (very difficult) side channel might exist in some configurations where an attacker can discern the contents of other virtual disks by writing to their own and observing what the hypervisor does. Of course a hypervisor is supposed to guard against this and the hypothetical examples would be critical security bugs, but sometimes these things slip through.
  • Attack the other VM directly. If you control a machine close to the victim VM, for example another VM on the same host, you may be able to take advantage of relaxed access control or intentional inter-VM communication, depending on how the host is configured and what assumptions were made when access control was deployed. This is only slightly relevant, but it does bear mention.
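
To make the shared-state idea from the third bullet concrete, here is a toy simulation of the flush, wait, reload pattern behind a Flush+Reload-style attack. It is only a sketch: nothing in it touches a real CPU cache, and `SharedCache`, `HIT_NS`, `MISS_NS`, the two addresses and the victim's square-and-multiply loop are all hypothetical stand-ins chosen for illustration. It also assumes that attacker and victim share the memory line in question and that the attacker gets to probe once per victim step, which a real attack has to work hard to arrange.

```python
# Toy model of a Flush+Reload-style side channel. Everything is simulated:
# SharedCache, HIT_NS and MISS_NS are hypothetical, not a real CPU interface.

HIT_NS = 40    # assumed latency when the line is already cached
MISS_NS = 200  # assumed latency when the line must be fetched from memory


class SharedCache:
    """A cache shared by attacker and victim, modelled as a set of line addresses."""

    def __init__(self):
        self.lines = set()

    def flush(self, addr):
        """Evict one line (the role a cache-flush instruction plays on real hardware)."""
        self.lines.discard(addr)

    def access(self, addr):
        """Load addr, cache it, and return the simulated latency of the load."""
        latency = HIT_NS if addr in self.lines else MISS_NS
        self.lines.add(addr)
        return latency


# Hypothetical addresses of two routines in code shared by both parties.
SQUARE, MULTIPLY = 0x1000, 0x2000


def victim_step(cache, key_bit):
    """One square-and-multiply step: the multiply routine runs only for 1 bits."""
    cache.access(SQUARE)
    if key_bit:
        cache.access(MULTIPLY)


def recover_key(cache, key_bits):
    """Attacker loop: flush the line, let the victim run, then time a reload."""
    threshold = (HIT_NS + MISS_NS) // 2
    recovered = []
    for bit in key_bits:
        cache.flush(MULTIPLY)
        victim_step(cache, bit)            # the attacker never reads `bit` directly
        latency = cache.access(MULTIPLY)   # fast reload means the victim touched it
        recovered.append(1 if latency < threshold else 0)
    return recovered


secret = [1, 0, 1, 1, 0, 0, 1, 0]
print(recover_key(SharedCache(), secret) == secret)
```

The attacker only ever observes latencies, yet recovers every bit, because the victim's control flow depends on the secret and leaves a trace in state both parties can influence. On real hardware the measurements are noisy and need many repetitions.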

Specific attacks will arise and be patched as time goes on, so it isn't ever valid to classify some particular mechanism as being exploitable, exploitable only in lab conditions, or unexploitable. As you can see, the attacks tend to be involved and difficult, but which ones are feasible at a particular time is something that changes rapidly, and you need to be prepared.

That said, the vectors I've mentioned above (with the possible exception of some cases of the last one) simply don't exist in bare-metal environments. So yes, given that security is about protecting against the exploits you don't know about and that aren't in the wild as well as the ones which have been publicly disclosed, you may gain a little security by running on bare metal or at least in an environment where the hypervisor doesn't host VMs for all and sundry.

In general, an effective strategy for secure application programming would be to assume that a computer has other processes running on it that might be attacker-controlled or malicious and use exploit-aware programming techniques, even if you think you are otherwise assuring no such process exists in your VM. However, particularly with the first two categories, remember that he who touches the hardware first wins.
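
As one small illustration of what exploit-aware programming can mean against timing side channels, the sketch below contrasts an early-exit comparison with Python's `hmac.compare_digest`. The `leaky_compare` helper and the token value are hypothetical; the count of examined bytes it returns stands in for the run time an attacker sharing the machine might be able to measure.

```python
import hmac


def leaky_compare(secret: bytes, guess: bytes):
    """Early-exit comparison (illustrative only).

    Returns (equal, examined), where `examined` is the number of bytes
    compared before returning, a stand-in for observable run time.
    """
    if len(secret) != len(guess):
        return False, 0
    examined = 0
    for s, g in zip(secret, guess):
        examined += 1
        if s != g:
            return False, examined  # stops at the first mismatch
    return True, examined


def constant_time_compare(secret: bytes, guess: bytes) -> bool:
    """Uses hmac.compare_digest, which is designed not to reveal where inputs differ."""
    return hmac.compare_digest(secret, guess)


token = b"s3cr3t-token"  # hypothetical secret
print(leaky_compare(token, b"x3cr3t-token"))   # wrong first byte: stops early
print(leaky_compare(token, b"s3cr3t-tokex"))   # wrong last byte: runs to the end
print(constant_time_compare(token, b"s3cr3t-token"))
```

The amount of work `leaky_compare` does depends on how much of the guess is correct, which is exactly the kind of secret-dependent behaviour a co-resident attacker looks for. This only narrows one class of leak; it does nothing against a compromised hypervisor or host hardware.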