Linux – System not being able to handle soft interrupts but having idle time

cpu-usage, linux, load-average, top

I have a constant 5% or more of CPU time spent handling soft interrupts. Due to that, ksoftirqd is running almost constantly, but it uses a very small amount of CPU (less than 1%).

However, regardless of this heavy load, there is still a fairly high percentage of idle time, 30% or more (this is the idle value from top, or the idle value from mpstat).

Some background (however, I would like a conceptual answer, not one that solves the problem on my system): the system is used for routing (echo 1 > /proc/sys/net/ipv4/ip_forward) and NAT with iptables, and it runs additional user-space applications not related to networking. Also, the load average is always above 1 on a single-core processor (this is the load average value from top, or the output of sar -q).

What is preventing the system from using the idle time to keep soft interrupts from being missed?

I would expect to see the idle time (id in top) be used for serving software interrupts (si in top) and not have the processor miss tasks and be idle at the same time.
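For reference, the id and si figures that top shows can be reproduced from two samples of the aggregate "cpu" line in /proc/stat. The following is only a minimal sketch: the helper names (parse_cpu_line, cpu_percentages, sample_cpu_percentages) are made up for illustration, and it assumes the usual field order user, nice, system, idle, iowait, irq, softirq, steal.

```python
import time

# Assumed field order of the aggregate "cpu" line in /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn a 'cpu ...' line into a dict of cumulative tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    # Older kernels expose fewer columns; treat the missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def cpu_percentages(before, after):
    """Percentage of time spent in each state between two samples."""
    first, second = parse_cpu_line(before), parse_cpu_line(after)
    deltas = {name: second[name] - first[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def sample_cpu_percentages(interval=1.0, path="/proc/stat"):
    """Hypothetical convenience wrapper: sample /proc/stat twice."""
    with open(path) as f:
        before = f.readline()
    time.sleep(interval)
    with open(path) as f:
        after = f.readline()
    return cpu_percentages(before, after)
```

Calling sample_cpu_percentages() on the router and looking at the "idle" and "softirq" entries should give roughly the same numbers as the id and si columns in top over the same interval.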

Best Answer

There is no heavy load on your system.

The interrupts are handled correctly, just like your routing and your application. If they weren't, your system wouldn't have 30% idle time. (By the way, where do you see that figure?)

Using a small system for many different things simultaneously doesn't necessarily mean it is overloaded, especially if there isn't much data to NAT. If your network interfaces use DMA (which is very likely), then in current kernels your interrupt handlers don't even perform a single block copy.

Besides that, your tasks load different parts of the system (while your network card is transferring data via DMA, your application can run on the CPU).

The only major problem on your system could be a high rate of task switches. That is the main problem on similar single-core, multipurpose servers. But in that case it would cause a heavy system load, which is not what you are seeing.
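If you want to check the task-switch rate yourself, the kernel keeps a cumulative context-switch counter on the "ctxt" line of /proc/stat. Below is a rough sketch that turns two snapshots of that file into a rate; read_counter and context_switch_rate are illustrative names, not an existing tool.

```python
def read_counter(stat_text, name="ctxt"):
    """Return the integer value of a single-value line in /proc/stat text."""
    for line in stat_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == name:
            return int(parts[1])
    raise KeyError(name)


def context_switch_rate(before_text, after_text, interval_seconds):
    """Context switches per second between two /proc/stat snapshots."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    switched = read_counter(after_text) - read_counter(before_text)
    return switched / interval_seconds
```

Read /proc/stat, wait a few seconds, read it again, and pass both texts with the elapsed time. What counts as "too many" switches depends on the hardware, so compare the result between a quiet period and a busy one rather than against a fixed threshold.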

I would be happy to extend this answer if you explained where this "30% idle" figure comes from. You say the system load is always above 1.
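To put the load figure next to the idle percentage, the same numbers that top prints as "load average" can be read from /proc/loadavg. This is a small sketch with a made-up helper name (parse_loadavg); it assumes the usual five-field layout of that file. Keep in mind that on Linux the load average also counts tasks in uninterruptible sleep, so it does not have to track CPU busy time exactly.

```python
def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into a dict."""
    fields = text.split()
    load1, load5, load15 = (float(x) for x in fields[:3])
    # Fourth field is "<currently runnable>/<total scheduling entities>".
    runnable, total = (int(x) for x in fields[3].split("/"))
    return {
        "load1": load1,
        "load5": load5,
        "load15": load15,
        "runnable": runnable,
        "total_tasks": total,
        "last_pid": int(fields[4]),
    }
```

For example, parse_loadavg(open("/proc/loadavg").read())["load1"] gives the one-minute value that the question describes as always being above 1.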