Folks on Super User suggested I ask here.
We are seeing slow execution times for simple commands on Linux, possibly related to procfs slowness. For example, a simple uptime command can take a few seconds to run.
Here is the setup:
- Platform: AWS
- Instance: x1.32xlarge (128 vCPUs, 2 TB RAM)
- OS: Ubuntu 16.04
- Kernel: 4.4.0-1043-aws
We run Docker with around 250 containers.
- Docker version: 17.09-ce
Utilization:
- CPU utilization: <50%
- Mem utilization: <50%
Some OS stats:
# cat /proc/loadavg
100.45 108.30 109.41 35/254951 544357
# vmstat 1 3
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
110 2 791584 485552640 50354496 687472448 0 0 426 183 1 1 10 8 82 1 0
13 0 791584 485495104 50353940 687477568 0 0 22820 47984 196555 326943 12 12 75 1 0
33 1 791584 485385632 50352428 687473536 0 0 38932 52892 166486 389428 13 14 72 1 0
# ps axu| wc -l
3792
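For context on the loadavg line above: the fourth field is running/total kernel scheduling entities, and the total counts threads, not processes — so this host has roughly 255K threads versus ~3.8K processes. A quick sketch to compare the two counts on any Linux host:

```shell
# The fourth field of /proc/loadavg is "running/total" kernel scheduling
# entities; "total" counts threads, not processes.
threads=$(awk -F'[ /]' '{print $5}' /proc/loadavg)
# Each numeric directory under /proc is one process.
procs=$(ls -d /proc/[0-9]* | wc -l)
echo "threads=$threads procs=$procs"
```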
What exactly happens?
Launching simple commands takes a long time whenever the command touches procfs in any way. For example, ls in a directory with a couple of files gets stuck on an open syscall against procfs:
# strace -r ls
...
0.000084 open("/proc/filesystems", O_RDONLY) = 3
3.652504 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
...
Or uptime:
# strace -r uptime
...
0.000035 open("/proc/filesystems", O_RDONLY) = 3
11.014154 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
...
0.000044 open("/proc/uptime", O_RDONLY) = 3
1.554646 lseek(3, 0, SEEK_SET)
...
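A lighter-weight way to watch the same latency without reading raw strace output is to time reads of the same procfs files directly (a rough sketch; the targets are just the paths that show up in the traces above):

```shell
# Time one read of each procfs path from the traces above.
# Times are printed in milliseconds.
for target in /proc/filesystems /proc/uptime; do
    start=$(date +%s%N)
    cat "$target" > /dev/null
    end=$(date +%s%N)
    echo "$target: $(( (end - start) / 1000000 )) ms"
done
```

On a healthy host each line reports ~0 ms; values in the seconds range match what strace shows above.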
Quick Q/A and what we already tried:
- This slowness is only present at the host level; inside containers we don't see it. We experimented a bit and can hit the issue inside a container only when it is run with both the --ipc=host and --pid=host flags.
- We have traced this procfs slowness to the mutex_lock here: https://github.com/torvalds/linux/blob/v4.4/fs/namei.c#L3082
- Too many containers?
  - No, we have 600 on another host and all is good.
- Too many processes?
  - No, we have 10K on another host and all is good.
- Too many threads?
  - That might be it. We don't have this many threads on any other host, but we tried to reproduce the issue on a clean x1.32xlarge instance and failed. So it might be threads plus something else.
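Since the thread count is the main suspect, here is a sketch for ranking processes by thread count (reading the Threads: line of /proc/PID/status), to see where the ~255K threads are coming from:

```shell
# Print the five processes with the most threads: "<threads> <pid> <comm>".
for d in /proc/[0-9]*; do
    pid=${d#/proc/}
    threads=$(awk '/^Threads:/ {print $2}' "$d/status" 2>/dev/null)
    [ -n "$threads" ] && echo "$threads $pid $(cat "$d/comm" 2>/dev/null)"
done | sort -rn | head -5
```

The 2>/dev/null guards cover processes that exit between the glob expansion and the read.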
Any ideas and suggestions are welcome.
Best Answer
We found the root cause.
We traced which processes were causing long calls to proc_pid_readdir.
This turned up a few processes, all of which were making really long calls. When we stopped ALL of them, we saw a great improvement.
So it looks like mixing a lot of processes and threads with frequent calls to proc_pid_readdir causes severe procfs slowness.
We will try the 4.14 kernel today and see if it gets better.
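As a crude reproducer sketch of that combination (the reader/iteration counts here are arbitrary; the real host had far more load): several parallel readers each scanning /proc repeatedly, since every full listing walks proc_pid_readdir across all PIDs.

```shell
# A few parallel readers, each doing repeated full /proc scans; every
# scan walks proc_pid_readdir over all PIDs. On a healthy host this
# finishes almost instantly; under the contention described above it crawls.
READERS=4
SCANS=20
for r in $(seq "$READERS"); do
    (
        for i in $(seq "$SCANS"); do
            ls /proc > /dev/null
        done
    ) &
done
wait
echo "done: $READERS readers x $SCANS scans each"
```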
UPDATE:
Kernel 4.14 solved our issue as well: the simple per-inode mutex has been replaced there with an rwsem.