Centos – loop0 command eating 100% of the CPU

apache-2.2 centos

Today I noticed that my server was becoming very slow.
I checked it with the top command and got:

top - 21:49:32 up 25 days,  9:13,  1 user,  load average: 1238.23, 825.34, 502.3
Tasks: 1815 total, 145 running, 1666 sleeping,   0 stopped,   4 zombie
Cpu(s):  1.3%us, 98.0%sy,  0.0%ni,  0.0%id,  0.4%wa,  0.0%hi,  0.4%si,  0.0%st
Mem:  12290984k total, 12252988k used,    37996k free,    30756k buffers
Swap:  1052248k total,   428116k used,   624132k free,   981528k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
 3129 root       5 -20     0    0    0 R 77.8  0.0  34:10.25 loop0              
 2281 nobody    18   0  163m  11m 3128 R 55.6  0.1   0:02.93 httpd              
 2021 nobody    19   0  162m  11m 3552 R 44.9  0.1   0:03.07 httpd              
  561 nobody    18   0  163m  11m 3172 R 44.4  0.1   0:02.03 httpd              
 2085 nobody    17   0  163m  11m 3176 R 41.4  0.1   0:03.22 httpd              
 1116 nobody    18   0  162m  11m 3168 R 37.2  0.1   0:02.38 httpd              
31809 nobody    18   0  163m  12m 3500 R 36.2  0.1   0:02.10 httpd              
 1906 nobody    17   0  161m 9364 1936 R 35.7  0.1   0:13.15 httpd              
31979 nobody    17   0  162m  11m 3404 R 30.7  0.1   0:04.41 httpd              
32610 nobody    18   0  161m 9688 2344 R 29.9  0.1   0:11.07 httpd              
 2326 nobody    17   0  162m  11m 3428 R 28.7  0.1   0:02.18 httpd              
  565 root      20  -5     0    0    0 R 27.4  0.0   4:29.02 kswapd0            
 2183 nobody    19   0  162m  11m 3100 R 26.4  0.1   0:02.55 httpd              
 1998 nobody    17   0  162m  10m 2484 R 24.7  0.1   0:10.76 httpd              
28515 nobody    16   0  169m  16m 5416 R 23.4  0.1   0:02.75 httpd              
 2056 nobody    19   0  166m  14m 5776 R 22.2  0.1   0:02.95 httpd              
32379 nobody    16   0  164m  12m 4376 R 20.7  0.1   0:01.52 httpd

I'd like to know what is wrong. I think it's related to the /tmp directory:

root@server [~]# mount
/dev/sda2 on / type ext3 (rw,usrquota)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
/usr/tmpDSK on /tmp type ext3 (rw,noexec,nosuid,loop=/dev/loop0)
/tmp on /var/tmp type none (rw,noexec,nosuid,bind)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)


root@server [~]# losetup -a
/dev/loop0: [0802]:103095300 (/usr/tmpDSK)

Best Answer

This line from the mount output is relevant:

/usr/tmpDSK on /tmp type ext3 (rw,noexec,nosuid,loop=/dev/loop0)

What this shows is that your /tmp file system is a loopback mount, which is why loop0 shows up in top. That's an unusual configuration and probably not an ideal one. It means that every access to /tmp has to be handled by the loop0 kernel thread whenever the data is not already in cache.

The output from top shows an excessively high load average of 1238.23 but you (only) have 145 processes in running state. If those two numbers are stable it would indicate that you have more than 1000 processes blocked waiting for I/O. How many of those blocked processes are waiting for loop0 to do some work cannot be determined from the shown output alone.
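One way to see how many tasks are actually blocked is to count them by state. The following is a minimal sketch, assuming a procps-style ps; the function name count_states is made up for illustration. On Linux the load average counts tasks in state R (runnable) and state D (uninterruptible sleep, usually waiting on I/O), so a large D count would support the reading above.

```shell
# Count tasks per state letter read from stdin (one state string per line).
# Only the first character is used, so "D<" and "D" are counted together.
count_states() {
    awk '{ n[substr($1, 1, 1)]++ } END { for (s in n) print s, n[s] }' | sort
}

# Usage on a live system (stat= prints the state column without a header):
#   ps -eo stat= | count_states
```

This does not say what the D-state tasks are waiting for, only how many there are.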

Given the large amount of used memory and the small numbers for free, buffers, and cached, I would conclude that the system is under significant memory pressure. It is a surprise that it hasn't used all of the swap space yet.
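To put a rough number on that, the share of RAM that is neither free nor used for buffers or page cache can be computed from /proc/meminfo. This is only a sketch; the function name mem_pressure is hypothetical, and the calculation ignores finer points such as reclaimable slab memory.

```shell
# Print the percentage of RAM that is not free, buffers, or page cache.
# Reads /proc/meminfo by default; a different file can be passed for testing.
mem_pressure() {
    awk '/^MemTotal:/ { t = $2 }
         /^MemFree:/  { f = $2 }
         /^Buffers:/  { b = $2 }
         /^Cached:/   { c = $2 }
         END { printf "%d\n", (t - f - b - c) * 100 / t }' "${1:-/proc/meminfo}"
}

# Usage:
#   mem_pressure
```

With the figures from the top output in the question (12290984k total, 37996k free, 30756k buffers, 981528k cached) this works out to 91 percent.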

I would add some more RAM to that server. And I would stop using loopback for /tmp. If the loopback device was set up because / was running out of disk space and /usr had space to share, there is a better way to use some of the space in /usr for /tmp. You can create a /usr/local/tmp directory and bind mount that to /tmp. A bind mount does not have the overhead of needing a loopback device.
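The bind mount could be set up roughly as follows. This is a sketch, not a tested migration procedure: the function names are hypothetical, it must run as root, and the noexec and nosuid options from the original loop mount are not reproduced here.

```shell
# Bind-mount a plain directory over /tmp instead of using a loop device.
# Defaults follow the /usr/local/tmp suggestion above.
bind_tmp() {
    src=${1:-/usr/local/tmp}
    dst=${2:-/tmp}
    mkdir -p "$src" || return 1
    chmod 1777 "$src" || return 1    # sticky and world-writable, like a normal /tmp
    mount --bind "$src" "$dst"
}

# Print an /etc/fstab entry that would recreate the bind mount at boot.
fstab_entry() {
    printf '%s %s none bind 0 0\n' "${1:-/usr/local/tmp}" "${2:-/tmp}"
}

# Usage (as root):
#   bind_tmp
#   fstab_entry >> /etc/fstab
```

The existing loop mount of /usr/tmpDSK, and the /var/tmp bind that depends on it, would presumably need to be unmounted and removed from the boot configuration first.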

Related Solutions

Centos – MySQL Eating CPU Usage (Urgent)

In my.cnf add the following lines under [mysqld]:
long_query_time = 5
log-slow-queries

Then restart MySQL and look for any queries showing up in the slow query log (any that take over 5 seconds to run). It's also possible that MySQL isn't actually the problem and that Apache is hammering it.

Linux – Java process eating CPU; Why

If you stop the solr process and a java process is still running, then there's another java process on your server. The first step is to document all of the Java processes that are running. A good tool for this on Unix is the ps tool. Try this:

$ ps auxwww | grep java

That output should show you all of the java processes running, and the commands that are being executed. Try this before and after you stop solr.

The second question is "which of these java processes is eating up so much CPU?" You may need to stop the jetty process in addition to the solr process. Also, you should never really need to restart a Linux server just because a single process is misbehaving. You can use the kill command to stop a process if you know its process ID, which you can get from either top or ps.
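As a rough illustration of that second step, the ps output can be filtered and sorted by CPU usage. This is a sketch assuming a procps-style ps; the function name rank_java is made up, and it only matches processes whose executable name contains "java".

```shell
# Read "PID %CPU COMMAND..." lines on stdin, keep those whose executable
# contains "java", and print the heaviest CPU users first (default: top 5).
rank_java() {
    awk '$3 ~ /java/' | LC_ALL=C sort -k2,2nr | head -n "${1:-5}"
}

# Usage:
#   ps -eo pid=,pcpu=,args= | rank_java
# Then stop the offender by PID (SIGTERM first; kill -9 only as a last resort):
#   kill <pid>
```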

In the short term it may be a good idea to install some sort of "watchdog" script on your machine to help with these situations. For example, monit can be used to automatically restart a service or process when it consumes a certain amount of CPU resources.

In the long term, I'm sorry to say that you have a performance issue. You need to look at reconfiguring solr and jetty at the very least. You may also need to look into garbage collection tuning and possibly adding more hardware. There's lots of information on these topics online, and I'm sorry to say that this process can be somewhat difficult.

Good luck!

Tom Purl
