Linux – “postgres blocked for more than 120 seconds” – is the db still consistent

kernel, linux, postgresql, virtualization

I am using an iSCSI volume on an Open-E storage system for several virtual machines running on a XenServer host. Occasionally, when disk I/O load on the virtual machines (and therefore on the storage system) is very high, I get this error message on the VM consoles:

[2594520.161701] INFO: task kjournald:117 blocked for more than 120 seconds.
[2594520.161787] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[2594520.162194] INFO: task flush-202:0:229 blocked for more than 120 seconds.
[2594520.162274] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[2594520.162801] INFO: task postgres:1567 blocked for more than 120 seconds.
[2594520.162882] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

As I understand it, the kernel prints this message to report that these processes have been blocked for more than 120 seconds, most likely because a disk access to the storage system has not yet completed.
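
To see which tasks were reported and how often, the hung-task lines can be pulled out of the kernel log. The following is a minimal sketch, assuming the log text has the same shape as the excerpt above; the function name `parse_hung_tasks` and the `SAMPLE` constant are hypothetical, and in practice the text would come from `dmesg` or a saved console log.

```python
import re

# Matches lines such as:
# [2594520.161701] INFO: task kjournald:117 blocked for more than 120 seconds.
# The task name may itself contain colons (e.g. "flush-202:0"), so the
# greedy name group backtracks to the last ":<pid> blocked" on the line.
HUNG_TASK = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<name>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def parse_hung_tasks(log_text):
    """Return one dict per hung-task report found in log_text."""
    reports = []
    for line in log_text.splitlines():
        match = HUNG_TASK.search(line)
        if match:
            reports.append({
                "uptime": float(match.group("ts")),
                "task": match.group("name"),
                "pid": int(match.group("pid")),
                "threshold_secs": int(match.group("secs")),
            })
    return reports


# Sample copied from the console output quoted in the question.
SAMPLE = """\
[2594520.161701] INFO: task kjournald:117 blocked for more than 120 seconds.
[2594520.161787] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[2594520.162194] INFO: task flush-202:0:229 blocked for more than 120 seconds.
[2594520.162801] INFO: task postgres:1567 blocked for more than 120 seconds.
"""

if __name__ == "__main__":
    for report in parse_hung_tasks(SAMPLE):
        print(report["task"], report["pid"], report["threshold_secs"])
```

Here the journal thread, the flusher thread, and postgres are all reported within the same second, which fits the assumption that they are all waiting on the same storage.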

But what is the effect on the processes? For example, will the postgres process eventually write its data once the storage system is idle again after a few minutes, so that all data is still consistent? Or will it abort the write, leaving some tables in an inconsistent state?

I would certainly expect the former: if disk access is slow, postgres (or any other affected process) should just wait as long as it takes. I can live with the application hanging for a few minutes. But if there is a chance of data corruption, then any of these errors is really bad news.

Please advise what to do here.

Best Answer

Your intuition that the DB stays consistent should be correct, unless the 120-second hangs are caused by the disk itself failing. If the root cause really is just high I/O load, PostgreSQL controls the order in which it commits data to disk (changes go to the write-ahead log before the data files), so a delayed write should not leave the database corrupt.
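
To illustrate why write ordering protects against a slow disk, here is a toy sketch of the log-first idea. It is not PostgreSQL's implementation; the file layout and the names `commit`, `load`, and `recover` are assumptions made up for this example. The point is only that `os.fsync` blocks until the storage acknowledges the write, so a slow device delays the commit rather than skipping it.

```python
import json
import os


def load(data_path):
    """Read the data file; treat a missing or torn file as empty."""
    if not os.path.exists(data_path):
        return {}
    with open(data_path) as data:
        try:
            return json.load(data)
        except json.JSONDecodeError:
            return {}  # half-written file, to be rebuilt from the log


def commit(log_path, data_path, key, value):
    """Record the change durably in the log, then update the data file."""
    # Step 1: append the change to the log and force it to stable storage.
    # This call may block for a long time on a busy disk, but it returns
    # only after the write has been acknowledged.
    with open(log_path, "a") as log:
        log.write(json.dumps({"key": key, "value": value}) + "\n")
        log.flush()
        os.fsync(log.fileno())
    # Step 2: only now rewrite the data file. If this step is interrupted,
    # the log still holds everything needed to redo it.
    state = load(data_path)
    state[key] = value
    with open(data_path, "w") as data:
        json.dump(state, data)


def recover(log_path, data_path):
    """Rebuild the data file by replaying the log from the start."""
    state = {}
    if os.path.exists(log_path):
        with open(log_path) as log:
            for line in log:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break  # incomplete final record: it was never committed
                state[record["key"]] = record["value"]
    with open(data_path, "w") as data:
        json.dump(state, data)
    return state
```

If the data file is damaged after both commits have returned, calling `recover` on the same paths restores every committed change from the log. A failing disk breaks the assumption this relies on, namely that an acknowledged write is really on stable storage.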

I've seen failing SATA disks hang while waiting for I/O operations to complete, which produces this same kernel message. By the time that happens, you probably can't trust the data on that disk very much; the 120-second hang is then merely a side effect of the failure rather than the root cause of the corruption.