Folks on Super User suggested I ask here.
We are seeing slow execution times for simple commands on Linux, possibly related to procfs slowness. For example, a simple uptime command can take a few seconds to run.
Here is the setup:
- Platform: AWS
- Instance: x1.32xlarge (128 vCPUs, 2 TB RAM)
- OS: Ubuntu 16.04
- Kernel: 4.4.0-1043-aws
We run Docker with around 250 containers.
- Docker version: 17.09-ce
Utilization:
- CPU utilization: <50%
- Mem utilization: <50%
Some OS stats:
# cat /proc/loadavg
100.45 108.30 109.41 35/254951 544357
# vmstat 1 3
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
110 2 791584 485552640 50354496 687472448 0 0 426 183 1 1 10 8 82 1 0
13 0 791584 485495104 50353940 687477568 0 0 22820 47984 196555 326943 12 12 75 1 0
33 1 791584 485385632 50352428 687473536 0 0 38932 52892 166486 389428 13 14 72 1 0
# ps axu| wc -l
3792
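For context on the loadavg line above: the fourth field is running/total kernel scheduling entities, and the total counts threads, not processes — so this host has roughly 255K threads versus ~3.8K processes. A quick sketch to compare the two counts on any Linux host:

```shell
# The fourth field of /proc/loadavg is "running/total" kernel scheduling
# entities; "total" counts threads, not processes.
threads=$(awk -F'[ /]' '{print $5}' /proc/loadavg)
# Each numeric directory under /proc is one process.
procs=$(ls -d /proc/[0-9]* | wc -l)
echo "threads=$threads procs=$procs"
```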
What exactly happens?
Launching simple commands takes a long time whenever the command touches procfs in any way. For example, ls in a directory with a couple of files gets stuck on an open syscall against procfs:
# strace -r ls
...
0.000084 open("/proc/filesystems", O_RDONLY) = 3
3.652504 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
...
Or uptime:
# strace -r uptime
...
0.000035 open("/proc/filesystems", O_RDONLY) = 3
11.014154 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
...
0.000044 open("/proc/uptime", O_RDONLY) = 3
1.554646 lseek(3, 0, SEEK_SET)
...
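A lighter-weight way to watch the same latency without reading raw strace output is to time reads of the same procfs files directly (a rough sketch; the targets are just the paths that show up in the traces above):

```shell
# Time one read of each procfs path from the traces above.
# Times are printed in milliseconds.
for target in /proc/filesystems /proc/uptime; do
    start=$(date +%s%N)
    cat "$target" > /dev/null
    end=$(date +%s%N)
    echo "$target: $(( (end - start) / 1000000 )) ms"
done
```

On a healthy host each line reports ~0 ms; values in the seconds range match what strace shows above.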
Quick Q/A and what we already tried:
- This slowness is only present at the host level; inside containers we don't see it. We experimented a bit and can hit the issue inside a container only when it is run with both the --ipc=host and --pid=host flags.
- We have traced this procfs slowness to the mutex_lock here: https://github.com/torvalds/linux/blob/v4.4/fs/namei.c#L3082
- Too many containers?
  - No, we have 600 on another host and all is good.
- Too many processes?
  - No, we have 10K on another host and all is good.
- Too many threads?
  - That might be it. We don't have this many threads on any other host, but we tried to reproduce the issue on a clean x1.32xlarge instance and failed. So it might be threads plus something else.
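Since the thread count is the main suspect, here is a sketch for ranking processes by thread count (reading the Threads: line of /proc/PID/status), to see where the ~255K threads are coming from:

```shell
# Print the five processes with the most threads: "<threads> <pid> <comm>".
for d in /proc/[0-9]*; do
    pid=${d#/proc/}
    threads=$(awk '/^Threads:/ {print $2}' "$d/status" 2>/dev/null)
    [ -n "$threads" ] && echo "$threads $pid $(cat "$d/comm" 2>/dev/null)"
done | sort -rn | head -5
```

The 2>/dev/null guards cover processes that exit between the glob expansion and the read.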
Any ideas and suggestions are welcome.
Best Answer
We found the root cause.
We traced which processes were causing long calls to proc_pid_readdir.
This turned up a few processes, all of which were making really long calls. When we stopped ALL of them, we saw a great improvement.
So it looks like mixing a lot of processes and threads with frequent calls to proc_pid_readdir causes severe procfs slowness.
We will try the 4.14 kernel today and see if it gets better.
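As a crude reproducer sketch of that combination (the reader/iteration counts here are arbitrary; the real host had far more load): several parallel readers each scanning /proc repeatedly, since every full listing walks proc_pid_readdir across all PIDs.

```shell
# A few parallel readers, each doing repeated full /proc scans; every
# scan walks proc_pid_readdir over all PIDs. On a healthy host this
# finishes almost instantly; under the contention described above it crawls.
READERS=4
SCANS=20
for r in $(seq "$READERS"); do
    (
        for i in $(seq "$SCANS"); do
            ls /proc > /dev/null
        done
    ) &
done
wait
echo "done: $READERS readers x $SCANS scans each"
```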
UPDATE:
Kernel 4.14 solved our issue as well: the simple per-inode mutex has been replaced there with an rwsem.