Linux – Monitoring XFS filesystem health in Linux

fsck, health, linux, xfs

I recently experienced a filesystem meltdown. I had a server running non-stop for about 180 days without any issues, but then I noticed weird things happening, and it turned out the ext3 filesystem was in really bad shape. I had the drives and the memory tested and they were all fine. Ultimately, I was forced to hose the system and do a full reinstall. fsck.ext3 only made things worse.

Now, I don't want this to happen again, so this time I went with XFS instead, which I feel is more mature than ext3, but I am at a loss as to how to monitor the health of the filesystem. xfs_check simply won't let me scan the device while it is mounted.

So, how do you monitor the health of an XFS filesystem while the system is online?

Best Answer

Truthfully, there isn't much you can do to monitor the operational health of the filesystem itself. This thread explains the reasons why you can't perform an fsck-style check on a filesystem that is mounted read/write.

In part, you should trust that, as a journalling filesystem, XFS is doing its best to keep your data in good health. You may also take some solace in knowing that xfs_check is much faster than fsck.ext3, and that XFS doesn't stipulate periodic checks in the same way as ext3's 180 day / x mounts rule.
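Since a full check isn't possible while the filesystem is mounted, about the only online signal left is whatever the kernel itself reports. As a rough sketch (not an official tool), the Python below filters kernel log lines, such as the output of dmesg or journalctl -k, for XFS messages that look alarming. The function name find_xfs_alerts and the keyword list are my own assumptions for illustration; the "XFS (device):" prefix is how XFS kernel messages normally appear, but check what your own kernel actually logs.

```python
import re

# Assumption: these keywords are illustrative only, not an exhaustive or
# official list of XFS kernel messages. Tune them to what your kernel logs.
ALERT_KEYWORDS = ("corrupt", "internal error", "shutdown", "i/o error")

# XFS kernel messages are normally prefixed with "XFS (<device>):".
XFS_PREFIX = re.compile(r"XFS \(([^)]+)\):")


def find_xfs_alerts(lines):
    """Return (device, message) pairs for XFS log lines that look alarming."""
    alerts = []
    for line in lines:
        match = XFS_PREFIX.search(line)
        if not match:
            continue  # not an XFS message
        message = line[match.end():].strip()
        # Case-insensitive substring match against the keyword list.
        if any(keyword in message.lower() for keyword in ALERT_KEYWORDS):
            alerts.append((match.group(1), message))
    return alerts
```

You could feed it the kernel log from a cron job and send yourself a mail whenever the result is non-empty. This only tells you that the kernel has already noticed a problem; it is no substitute for an offline check.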


Edit in response to comments:

While I understand that you're once bitten, twice shy, I can assure you that "complete meltdown" isn't a systemic issue associated with UNIX filesystems. In my experience, such events tend to materialise only in hand with hardware failure, user error (no disrespect intended), or an unfortunate mixture of both. However, it is hard to reason about this on a technical level without some very specific details of what went wrong with your previous ext3 install.