What does the vSphere 8-core maximum mean?

vcpu vmware-vsphere

I'm configuring software that will need about 12 cores to run, but I am running it on a vSphere virtual machine, meaning that at most 8 (virtual) cores can be configured.[1] The physical machine that this will be running on has 24 cores available, and hosts only one other virtual machine. That VM has 8 cores allocated as well.

Since my machine has 8 cores, and the other machine has 8 cores, this means that 8 cores seem to be left completely idle. This seems wrong.

I'm guessing that VMware has done something clever: even though I can only have 8 cores allocated to my machine, if there are 24 cores on the back end, then I am guaranteed to get 8 of them, but can use all 24 if no other machine is using them.

I've been reading about co-sharing in vSphere,[2] but it's a bit over my head.

Can anybody explain how this works?

Edit: This is the explanation I've been given about the 8 CPU limit, but I need to confirm it.

vSphere uses dynamic processor load balancing that allows each assigned core in a guest access to all cores on the host. While the guest OS will only see 8 physical processors, each processor has access to a pool of 24 cores. This is very similar to how a mainframe works.

This hints at the 8 cores behaving more like 24 cores, but now this just seems wrong. Is it?

Best Answer

What this means is that you'll never have more than 8 threads executing in parallel in your virtual machine. However, through the magic of ESX resource allocation you can give those threads quite a bit of horsepower. You're not limited to the maximum rate of your actual CPUs: ESX's CPU load-balancing methods will permit running faster than that, so long as your workload can tolerate that kind of slightly out-of-order execution.
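To make the first point concrete, here is a minimal toy model in Python. It is not how the ESX scheduler is actually implemented; the function `simulate`, the VM names, and the random placement policy are all assumptions made up for illustration. It only shows that a VM's vCPUs can land on different physical cores from one tick to the next, while the number running at any instant never exceeds the configured vCPU count.

```python
import random

def simulate(host_cores, vm_vcpus, ticks, seed=0):
    """Toy scheduler: on every tick, place each VM's vCPUs on distinct free cores.

    host_cores: number of physical cores on the host (24 in the question)
    vm_vcpus:   dict mapping a (hypothetical) VM name to its vCPU count
    """
    rng = random.Random(seed)
    max_parallel = {vm: 0 for vm in vm_vcpus}   # most vCPUs running at once
    cores_used = {vm: set() for vm in vm_vcpus}  # every core the VM ever touched

    for _ in range(ticks):
        free = list(range(host_cores))
        rng.shuffle(free)  # assumption: placement is random in this toy model
        for vm, vcpus in vm_vcpus.items():
            # A VM can never run more vCPUs at once than it was configured with.
            placed = [free.pop() for _ in range(min(vcpus, len(free)))]
            max_parallel[vm] = max(max_parallel[vm], len(placed))
            cores_used[vm].update(placed)

    return max_parallel, cores_used

max_parallel, cores_used = simulate(24, {"my_vm": 8, "other_vm": 8}, ticks=200)
print(max_parallel)  # {'my_vm': 8, 'other_vm': 8}
print({vm: len(cores) for vm, cores in cores_used.items()})
```

In this sketch each VM ends up having touched more than 8 distinct physical cores over time, yet it is capped at 8 running at once, which is the sense in which the guest "sees" only 8 processors.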

This is accomplished by leveraging a queue structure for CPU resources. Work is dispatched to multiple processors as needed. A single vCPU may execute on any of the physical CPUs on the system (or the CPUs in the local-node if in a NUMA system). Achieving vCPU performance in excess of physical CPU performance is done by dispatching work from a single vCPU to multiple physical CPUs in parallel or at least very close together.

When it comes time to reassemble work returned by multiple CPUs to emulate a single vCPU, it does look to the VM like a single CPU working very fast. ESX reassembles the multiple work-units into the correct order.

Not all workloads are well suited to this, of course. Jobs that submit a lot of iterative work whose pieces are only loosely related to each other, if at all, are perhaps the best case. Jobs that involve lots of tight dependencies on earlier instructions, such as the recursive calls of crypto algorithms, won't be able to scale nearly as far.