Slow command execution time in Linux

docker, kernel, linux

Folks on Super User suggested I post this here.

We are seeing slow execution of simple commands on Linux, possibly related to procfs slowness.

For example, a simple uptime command can take a few seconds to execute.

Here is the setup:

  • Platform: AWS
  • Instance: x1.32xlarge (128 vCPUs, 2 TB RAM)
  • OS: Ubuntu 16.04
  • Kernel: 4.4.0-1043-aws

We run Docker with around 250 containers.

  • Docker version: 17.09-ce

Utilization:

  • CPU utilization: <50%
  • Mem utilization: <50%

Some OS stats:

# cat /proc/loadavg
100.45 108.30 109.41 35/254951 544357

# vmstat 1 3
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
110  2 791584 485552640 50354496 687472448    0    0   426   183    1    1 10  8 82  1  0
13  0 791584 485495104 50353940 687477568    0    0 22820 47984 196555 326943 12 12 75  1  0
33  1 791584 485385632 50352428 687473536    0    0 38932 52892 166486 389428 13 14 72  1  0

# ps axu| wc -l
3792
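
So there are roughly 3.8K processes, but the 35/254951 pair in /proc/loadavg above indicates around 255K kernel scheduling entities, i.e. threads. A quick way to confirm the total thread count directly (a sketch; the exact number fluctuates constantly):

# ps -eo nlwp= | awk '{ total += $1 } END { print total }'
# awk '{ split($4, a, "/"); print a[2] }' /proc/loadavg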

What exactly happens?

Launching simple commands takes a long time whenever the command touches procfs in any way. For example, running ls in a directory with a couple of files gets stuck on the open syscall for procfs. (Note that strace -r prints the time elapsed since the previous syscall, so the multi-second value on the fstat line below is time spent in the preceding open.)

# strace -r ls
...
0.000084 open("/proc/filesystems", O_RDONLY) = 3
3.652504 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
...

Or uptime:

# strace -r uptime
...
0.000035 open("/proc/filesystems", O_RDONLY) = 3
11.014154 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
...
0.000044 open("/proc/uptime", O_RDONLY) = 3
1.554646 lseek(3, 0, SEEK_SET)
...

Quick Q/A and what we already tried:

  • This slowness is only present at the host level; inside containers we don't see such issues. We experimented a bit and can reproduce the issue when we run a container with both the --ipc=host and --pid=host flags (see the sketch after this list).
  • We have traced this procfs slowness to the mutex_lock call here: https://github.com/torvalds/linux/blob/v4.4/fs/namei.c#L3082
  • Too many containers?
    • No, we have 600 on another host and all is well.
  • Too many processes?
    • No, we have 10K on another host and all is well.
  • Too many threads?
    • That might be it. We don't have this many threads on any other host, but we tried to reproduce the problem on a clean x1.32xlarge instance and failed. So it may be threads plus something else.
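
For reference, a minimal way to exercise the host-namespace case mentioned above (a sketch; the image and the timed command are arbitrary choices, not what we actually ran). With host PID/IPC namespaces (slow on the affected host):

# docker run --rm --ipc=host --pid=host ubuntu:16.04 bash -c 'time uptime'

Without them (fast):

# docker run --rm ubuntu:16.04 bash -c 'time uptime'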

Any ideas and suggestions are welcome.

Best Answer

We found the root cause.

We ran

/usr/share/bcc/tools/funcslower -m 250 -T proc_pid_readdir

to find the processes that were making long calls to proc_pid_readdir (funcslower is part of the bcc tools).

This turned up a few processes:

zabbix_agentd
atop
byobu

All of them were making very long calls:

TIME       COMM           PID    LAT(ms)             RVAL FUNC
20:01:35   zabbix_agentd  921144 1258.01                0 proc_pid_readdir 
20:01:38   zabbix_agentd  921144 2692.71                0 proc_pid_readdir 
20:01:39   zabbix_agentd  921145 1276.88                0 proc_pid_readdir
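
To see what one of these agents is actually doing to procfs, attaching strace to it is enough (a sketch; the PID here is just the zabbix_agentd process from the output above and is specific to our host):

# strace -f -T -e trace=open,openat,getdents -p 921144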

When we stopped ALL of these, we saw a great improvement.

So it looks like combining a very large number of processes and threads with overly frequent calls to proc_pid_readdir causes severe procfs slowness.
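
To gauge how often /proc is being walked on a host, bcc's funccount can count the calls per interval (a sketch):

# /usr/share/bcc/tools/funccount -i 1 proc_pid_readdir

A high rate here from several monitoring agents is what we were seeing; each pass contends for the same per-inode locking that the question traced the stalls to.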

We will try the 4.14 kernel today and see if it improves things.

UPDATE:

Kernel 4.14 solved our issue as well. The simple per-inode mutex has been replaced with an rwsem there.