Ext3 fsck time versus partition size


I'm doing the setup for a large-scale storage farm, and to avoid the need for month-long fscks, my plan is to split the storage into numerous smaller filesystems (this is fine, as I have a well-bucketed file tree, so I can easily have separate filesystems mounted on 1/, 2/, 3/, 4/, etc).

My difficulty is in finding any enumeration of what a "reasonable" size is for a filesystem, to keep fsck times similarly "reasonable". Whilst I'm fully aware that absolute time for a given size will depend largely on hardware, I can't seem to find any description of the shape of the curve for ext3 fsck times with varying filesystem sizes, and what the other variables are (does a filesystem full of files in a single directory take longer than one with 10 files in each of thousands of directories in a tree; large files vs small files; full filesystem vs empty filesystem; and so on).

Does anyone have references to any well-researched numbers about this? Failing that, any anecdotes about these issues should at least help to guide my own experimentation, should that be required.
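If it does come to running my own tests, something along these lines is what I have in mind. It's a minimal sketch rather than a tuned benchmark: the image paths, sizes and inode counts are placeholders, it only times empty, freshly-formatted filesystems (for realistic numbers you'd mount each image and populate it with a representative file tree first), and it assumes e2fsprogs and coreutils are installed.

```python
import subprocess
import time

# Hypothetical test cases: (image path, size in MiB, inode count at mkfs time).
# All values are placeholders to adapt to your own hardware and file mix.
CASES = [
    ("/tmp/fs-small.img", 1024, 65536),
    ("/tmp/fs-large.img", 4096, 1048576),
]

for path, size_mib, inodes in CASES:
    # Create a sparse image file and format it as ext3 with a fixed inode count.
    subprocess.run(["truncate", "-s", f"{size_mib}M", path], check=True)
    subprocess.run(["mkfs.ext3", "-F", "-q", "-N", str(inodes), path], check=True)

    # -f forces a check even on a clean filesystem; -n keeps it read-only.
    start = time.monotonic()
    subprocess.run(["e2fsck", "-f", "-n", path], check=True)
    elapsed = time.monotonic() - start
    print(f"{path}: {size_mib} MiB, {inodes} inodes -> fsck took {elapsed:.1f}s")
```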

EDIT: To clarify: regardless of the filesystem, if something goes wrong with the metadata, it will need to be checked. Whether time- or mount-based re-fscks are enabled or needed is not at issue, and the only reason I'm asking for numbers specifically regarding ext3 is because that's the most likely filesystem to be chosen. If you know of a filesystem that has a particularly fast fsck process, I'm open to suggestions, but it does need to be a robust option (claims that "filesystem X never needs fscking!" will be laughed at and derided at length). I am also aware of the need for backups, and the desire to fsck is not a substitute for backups, however just discarding the filesystem and restoring from backup when it glitches, rather than fscking it, seems like a really, really dumb tradeoff.

Best Answer

According to a paper by Mathur et al. (p. 29), e2fsck time grows linearly with the number of inodes on a filesystem beyond a certain point. If the graph is anything to go by, you're better off with filesystems of up to 10 million inodes.
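As a rough illustration of what that linear scaling implies for splitting the storage: once you've measured a per-inode check rate on your own hardware, the estimate is a straight multiplication. The rate in the sketch below is an invented placeholder, not a measured figure.

```python
# Back-of-the-envelope estimate assuming fsck time is roughly linear in the
# number of used inodes. The rate is hypothetical; measure your own.
SECONDS_PER_MILLION_INODES = 30.0  # placeholder, hardware-dependent

def estimated_fsck_seconds(used_inodes: int) -> float:
    return used_inodes / 1_000_000 * SECONDS_PER_MILLION_INODES

# One 100M-inode filesystem vs. ten 10M-inode filesystems checked in parallel:
print(estimated_fsck_seconds(100_000_000))  # ~3000 s in one go
print(estimated_fsck_seconds(10_000_000))   # ~300 s per (parallel) chunk
```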

Switching to ext4 would help, provided your filesystems are not loaded to the brim: with nearly every inode in use, the performance gain (from not having to check inodes marked as unused) has no discernible effect.
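To judge how close a mounted filesystem is to that "loaded to the brim" state in inode terms, df -i is the usual tool; a rough Python equivalent via os.statvfs (with the mount point as a placeholder) would be:

```python
import os

def inode_usage(mount_point: str) -> float:
    """Return the fraction of inodes in use on the filesystem at mount_point."""
    st = os.statvfs(mount_point)
    used = st.f_files - st.f_ffree  # total inodes minus free inodes
    return used / st.f_files if st.f_files else 0.0

# A ratio near 1.0 means almost every inode is in use, so ext4's ability to
# skip unused inodes would buy little at fsck time.
print(f"{inode_usage('/') * 100:.1f}% of inodes in use")
```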
