Tracking down an elusive and slow anon_inode from lsof and Apache strace

apache-2.2 lsof strace

I'm experiencing an intermittent issue with a LAMP application in which Apache forks up to its ServerLimit and grinds to a (near) halt. An strace on any httpd process shows numerous slow epoll_wait calls:

1.254721 epoll_wait(14, {{EPOLLIN, ...
3.296430 epoll_wait(14, {{EPOLLIN, ...
1.018047 epoll_wait(14, {{EPOLLIN, ...
1.279721 epoll_wait(14, {{EPOLLIN, ...
1.145649 epoll_wait(14, {{EPOLLIN, ...
1.269836 epoll_wait(14, {{EPOLLIN, ...
1.094779 epoll_wait(14, {{EPOLLIN, ...
1.205911 epoll_wait(14, {{EPOLLIN, ...
9.052785 epoll_wait(14, {{EPOLLIN, ...
1.116279 epoll_wait(14, {{EPOLLIN, ...
1.027709 epoll_wait(14, {{EPOLLIN, ...
1.178679 epoll_wait(14, {{EPOLLIN, ...
1.336032 epoll_wait(14, {{EPOLLIN, ...
2.541861 epoll_wait(14, {{EPOLLIN, ...
1.113012 epoll_wait(14, {{EPOLLIN, ...

An lsof on the same process reports file descriptor 14 as an anon_inode:

COMMAND  PID   USER   FD   TYPE   DEVICE      SIZE     NODE NAME
httpd   9709 apache   14u  0000      0,7         0      373 anon_inode

Any insight as to what that could be, or advice to track down that information?

Best Answer

The confusion comes from older versions of lsof not reporting this descriptor as well as they should. I can reproduce the bare anon_inode output on Ubuntu 14.04, but on a newer Ubuntu release lsof shows:

perl 511299 frew 3u a_inode 0,11 0 9666 [eventpoll]

That output is much clearer: the descriptor is an epoll instance. An epoll file descriptor will always look slow in strace, because epoll_wait is literally the kernel blocking until either a timeout expires or an event arrives on one of the other file descriptors it watches. See epoll(7) for more details.
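To see both points for yourself, here is a minimal Python sketch (Linux only). It is not taken from Apache; the helper names describe_fd, timed_poll and demo are made up for illustration, and it assumes /proc is mounted and that the kernel names epoll descriptors the way recent kernels do. It creates an epoll instance, asks /proc what the descriptor is, then times one wait with nothing to report and one wait after data arrives on a watched pipe.

```python
import os
import select
import time


def describe_fd(fd: int) -> str:
    """Return the kernel's name for one of this process's descriptors.

    Assumes /proc is mounted; for an epoll instance a recent kernel
    typically reports something like 'anon_inode:[eventpoll]'.
    """
    return os.readlink(f"/proc/self/fd/{fd}")


def timed_poll(ep: select.epoll, timeout: float):
    """Run one epoll wait and report how long it blocked."""
    start = time.monotonic()
    events = ep.poll(timeout)  # this is what issues epoll_wait
    return time.monotonic() - start, events


def demo(timeout: float = 0.5):
    """Time an idle wait and a wait with pending data on a watched pipe."""
    read_end, write_end = os.pipe()
    ep = select.epoll()
    try:
        ep.register(read_end, select.EPOLLIN)
        kind = describe_fd(ep.fileno())

        # Nothing is readable yet, so the call blocks until the timeout.
        idle_elapsed, idle_events = timed_poll(ep, timeout)

        # Make the pipe readable; the next wait returns without blocking long.
        os.write(write_end, b"x")
        busy_elapsed, busy_events = timed_poll(ep, timeout)

        return kind, idle_elapsed, idle_events, busy_elapsed, busy_events
    finally:
        ep.close()
        os.close(read_end)
        os.close(write_end)


if __name__ == "__main__":
    kind, idle_elapsed, idle_events, busy_elapsed, busy_events = demo()
    print(f"epoll fd is: {kind}")
    print(f"idle wait:  {idle_elapsed:.3f}s, events={idle_events}")
    print(f"ready wait: {busy_elapsed:.3f}s, events={busy_events}")
```

The same /proc lookup can be pointed at another process, for example by reading the link /proc/9709/fd/14 for the httpd process above (with sufficient privileges), which may name the descriptor even when an older lsof only prints anon_inode. The long epoll_wait durations in strace then correspond to the idle case here: the process is waiting for something to happen on the descriptors it watches.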