Linux – Why would only half of the cores on the hyperthreaded linux server be loaded

hyperthreading linux

I have a server which is a 12 core hyperthreaded system, which means I have 24 virtual cores.

I'm running 24 processes on my server, each listening on its own port and doing the same work, albeit for different clients and different requests. Each process is a Python script built with gevent for concurrency while waiting for network operations to complete. top and htop show each of the processes using about the same CPU and memory. Since I am running the same number of processes as cores, I would expect all of the cores to be loaded about the same. However, I am seeing only half of the cores with any real load on them (the rest show minimal load).

What is even more odd to me is that it is always the same cores, 6-11 and 18-23. What's more, I have three identical servers doing about the same thing under the same load, and all three are using the same cores at about the same load. Does anyone know why this would be?

Here's the sar output from one of these servers:

04:34:01 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
04:35:01 PM     all     18.67      0.00      3.65      0.01      0.00     77.68
04:35:01 PM       0      9.24      0.00      0.76      0.00      0.00     89.99
04:35:01 PM       1      3.16      0.00      0.55      0.00      0.00     96.30
04:35:01 PM       2      1.40      0.00      0.66      0.00      0.00     97.94
04:35:01 PM       3      0.46      0.00      0.12      0.00      0.00     99.42
04:35:01 PM       4      0.15      0.00      0.12      0.00      0.00     99.73
04:35:01 PM       5      0.35      0.00      0.81      0.00      0.00     98.84
04:35:01 PM       6     44.19      0.00     10.05      0.02      0.00     45.74
04:35:01 PM       7     43.99      0.00     10.84      0.02      0.00     45.15
04:35:01 PM       8     27.00      0.00      2.57      0.09      0.00     70.33
04:35:01 PM       9     40.91      0.00      9.02      0.02      0.00     50.06
04:35:01 PM      10     41.97      0.00     10.27      0.00      0.00     47.77
04:35:01 PM      11     33.52      0.00      5.26      0.02      0.00     61.21
04:35:01 PM      12      0.53      0.00      0.10      0.00      0.00     99.37
04:35:01 PM      13      0.32      0.00      0.08      0.00      0.00     99.60
04:35:01 PM      14      0.22      0.00      0.10      0.00      0.00     99.68
04:35:01 PM      15      0.13      0.00      0.10      0.00      0.00     99.77
04:35:01 PM      16      0.12      0.00      0.05      0.00      0.00     99.83
04:35:01 PM      17      0.13      0.00      0.30      0.00      0.00     99.57
04:35:01 PM      18     16.54      0.00      1.49      0.00      0.00     81.97
04:35:01 PM      19     36.16      0.00      5.85      0.02      0.00     57.98
04:35:01 PM      20     29.22      0.00      4.97      0.10      0.00     65.71
04:35:01 PM      21     32.86      0.00      5.25      0.02      0.00     61.87
04:35:01 PM      22     43.01      0.00      9.19      0.00      0.00     47.80
04:35:01 PM      23     39.63      0.00      8.61      0.02      0.00     51.74

And here is the output from /proc/cpuinfo for one of the cores:

processor       : 23
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           X5675  @ 3.07GHz
stepping        : 2
cpu MHz         : 1600.000
cache size      : 12288 KB
physical id     : 1
siblings        : 12
core id         : 10
cpu cores       : 6
apicid          : 53
initial apicid  : 53
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm ida arat tpr_shadow vnmi flexpriority ept vpid
bogomips        : 6133.17
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

These systems also have ~24GB of RAM, of which less than 4GB is being used, and do not show any swap activity. There is also very little disk activity. Almost all of what these servers do is network-bound, about 60-80MB/s each, in and out over dual gigabit ethernet cards bonded to a single interface.

Best Answer

This is because it is a hyperthreaded server. Half of the CPUs are only "virtual": each physical core shows up as two logical CPUs that share its execution resources. So Linux tries to avoid these virtual CPUs and concentrate on the real ones.
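
One way to check which logical CPUs share a physical core is to group the entries in /proc/cpuinfo by their "physical id" and "core id" fields, both of which appear in the output in the question. The following is only a minimal sketch; the function names parse_cpuinfo and group_siblings are made up for illustration, and it assumes the usual layout of one blank-line-separated block per logical CPU.

```python
from collections import defaultdict


def parse_cpuinfo(text):
    """Map each logical CPU number to its (physical id, core id) pair."""
    topology = {}
    # /proc/cpuinfo separates logical CPUs with a blank line.
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        # Skip blocks that lack topology fields (some platforms omit them).
        if all(k in fields for k in ("processor", "physical id", "core id")):
            topology[int(fields["processor"])] = (
                int(fields["physical id"]),
                int(fields["core id"]),
            )
    return topology


def group_siblings(topology):
    """Group logical CPUs that report the same (physical id, core id)."""
    groups = defaultdict(list)
    for cpu, key in sorted(topology.items()):
        groups[key].append(cpu)
    return dict(groups)
```

On a Linux machine you could read the file with open("/proc/cpuinfo").read(), pass the text to parse_cpuinfo, and print the result of group_siblings. Logical CPUs listed together are hyperthread siblings on one core, and the first number of each key is the physical package (socket). In the question's output, for example, processor 23 reports physical id 1 and core id 10.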

As your system is not under heavy load (the sar output shows it about 78% idle overall), you can't see that the other CPUs will be used at higher load. Try increasing the load and you will see the difference.
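
If you want to raise the load artificially to test this, a small CPU-bound load generator is enough. This is a rough sketch, not a benchmark; burn and run_load are hypothetical helper names, and the busy loop stands in for real work.

```python
import multiprocessing
import time


def burn(seconds):
    """Spin in a tight loop for roughly `seconds`; return the iteration count."""
    deadline = time.monotonic() + seconds
    iterations = 0
    while time.monotonic() < deadline:
        iterations += 1
    return iterations


def run_load(workers, seconds):
    """Start `workers` CPU-bound processes and wait for all of them to finish."""
    procs = [
        multiprocessing.Process(target=burn, args=(seconds,))
        for _ in range(workers)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

Call it from under an `if __name__ == "__main__":` guard, for example run_load(multiprocessing.cpu_count(), 60), and watch the per-CPU figures with `sar -P ALL 1` or htop while it runs. Starting with fewer workers than logical CPUs and increasing the count step by step shows the order in which the scheduler brings the remaining CPUs into use.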