My guess is that there's some other process that hogs the disk I/O capacity for a while. iotop
can help you pinpoint it, if you have a recent enough kernel.
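For example, a non-interactive run (a sketch; the flags are from iotop's manual, and it needs root):

```shell
# Show only processes actually doing I/O (-o), in batch mode (-b),
# sampling every 5 seconds (-d 5) for 12 iterations (-n 12).
sudo iotop -o -b -d 5 -n 12
```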
If this is the case, it's not about the filesystem, much less about journalling; it's the I/O scheduler that is responsible for arbitrating between competing applications. An easy test: check the current scheduler and try a different one. This can be done on the fly, without restarting. For example, on my desktop, to check the first disk (/dev/sda):
cat /sys/block/sda/queue/scheduler
=> noop deadline [cfq]
shows that it's using CFQ, which is a good choice for desktops but not so much for servers. Better to set 'deadline':
echo 'deadline' > /sys/block/sda/queue/scheduler
cat /sys/block/sda/queue/scheduler
=> noop [deadline] cfq
and wait a few hours to see if it improves. If it does, set it permanently in the startup scripts (how depends on the distribution).
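One common way to make it permanent (a sketch; the rule path and disk pattern are assumptions, and the exact mechanism varies by distribution) is a udev rule, which reapplies the setting whenever the disk appears, or a line in a boot script:

```shell
# Hypothetical udev rule, e.g. in /etc/udev/rules.d/60-iosched.rules:
#   ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/scheduler}="deadline"

# Or, more crudely, from a boot script such as /etc/rc.local (run as root):
echo deadline > /sys/block/sda/queue/scheduler
```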
Install one system, boot it, and check the block-layer statistics from /sys/block/${DEV}/stat, e.g. /sys/block/sda/stat.
Quoting from the documentation:
The stat file consists of a single line of text containing 11 decimal values separated by whitespace. The fields are summarized in the following table, and described in more detail below:
Name           units         description
----           -----         -----------
read I/Os      requests      number of read I/Os processed
read merges    requests      number of read I/Os merged with in-queue I/O
read sectors   sectors       number of sectors read
read ticks     milliseconds  total wait time for read requests
write I/Os     requests      number of write I/Os processed
write merges   requests      number of write I/Os merged with in-queue I/O
write sectors  sectors       number of sectors written
write ticks    milliseconds  total wait time for write requests
in_flight      requests      number of I/Os currently in flight
io_ticks       milliseconds  total time this block device has been active
time_in_queue  milliseconds  total wait time for all requests
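To read the file more comfortably, the 11 fields can be labelled with a short awk script (a sketch; the sample line below is made up in the same format — on a real system, pipe in /sys/block/sda/stat instead):

```shell
# Label each of the 11 stat fields with its name.
echo '1920 1158 85322 1165 33 64 762 51 0 895 1217' | awk '{
    n = split("read_ios read_merges read_sectors read_ticks " \
              "write_ios write_merges write_sectors write_ticks " \
              "in_flight io_ticks time_in_queue", name, " ")
    for (i = 1; i <= n; i++)
        printf("%-14s %s\n", name[i], $i)
}'
```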
read sectors, write sectors
These values count the number of sectors read from or written to this block device. The "sectors" in question are the standard UNIX 512-byte sectors, not any device- or filesystem-specific block size. The counters are incremented when the I/O completes.
You can use this one-liner to get the number of bytes more easily:
awk '{printf("read %d bytes, wrote %d bytes\n", $3*512, $7*512)}' /sys/block/vda/stat
Results for Scientific Linux 6.1 i386
I tested this on a KVM/qemu virtual machine running Scientific Linux 6.1 i386 (which is similar to RHEL). The following services were enabled: acpid, auditd, crond, network, postfix, rsyslog, sshd and udev-post. The swap is on a separate disk, so it's not taken into account.
The stats for 85 boots, taken remotely with SSH a couple of seconds after the login prompt appeared, were:
Name              Median   Average    Stdev
----------------  ------   -------    -----
read I/Os           1920    1920.2      2.6
read merges         1158    1158.4      1.8
read sectors       85322   85330.9     31.9
>> read MiBytes   41.661    41.665    0.016
read ticks          1165    1177.2     94.1
write I/Os            33      32.6      1.7
write merges          64      59.6      7.4
write sectors        762     715.2     70.9
>> write MiBytes   0.372     0.349    0.035
write ticks           51      59.0     17.4
in_flight              0       0.0      0.0
io_ticks             895     909.9     57.8
time_in_queue       1217    1235.2     98.5
The boot time was around 20 seconds.
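As a sanity check, that works out to a modest average read rate (a rough back-of-the-envelope figure using the median sector count and the ~20 second boot time):

```shell
# ~85322 sectors * 512 bytes, spread over ~20 seconds, in MiB/s
awk 'BEGIN { printf("%.1f MiB/s\n", 85322 * 512 / 20 / 1048576) }'
# prints: 2.1 MiB/s
```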
Best Answer
Is this a one-time thing, or is this information you want to be able to extract regularly? If it is the latter, one option is to apply quotas on your filesystem. That way the system continuously keeps track of the amount of data used by each user, and the information is just a query to the quota database away.
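A sketch of that setup on ext4 (the mount point, device and user name are assumptions; details vary by distribution and filesystem):

```shell
# Enable user quotas on an existing ext4 mount (run as root).
# 1. Add usrquota to the mount options in /etc/fstab, e.g.:
#    /dev/sda2  /home  ext4  defaults,usrquota  0 2
mount -o remount /home

# 2. Build the initial quota database and turn quotas on.
quotacheck -cum /home
quotaon /home

# 3. Query usage: per user, or a report for the whole filesystem.
quota -u alice
repquota /home
```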