Linux – Huge amounts of “multiply-claimed blocks” during fsck

fsck linux Ubuntu

The basic question:

How long should fsck take to fix a 100GB (17 million block) file with multiply-claimed blocks?

The long version of the question:

After a UPS failure, I am faced with an Ubuntu 10.04 server which dropped into fsck on initial boot. This is normal, but usually about half an hour of fixing the various problems by agreeing to the prompts is enough to get the server back.

Not today, though. Today, a huge list of numbers scrolled past on the console, matrix-style, for a good few minutes. It was basically line after line of:

Multiply-claimed blocks in inode xxxxxxxxx

Anyway, after a few minutes of those scrolling past, it finally settled down and I got:

Pass 1C: Scanning directories for inodes with multiply-claimed blocks

followed by…

Pass 1D: Reconciling multiply-claimed blocks

..and..

(There are 32 inodes containing multiply-claimed blocks.)

That didn't sound so bad, but then it started going through some files like so:

File /path/to/a/file

has 1 multiply-claimed block(s) shared with 1 file(s):

/path/to/another/file

Clone multiply-claimed blocks? yes

This question was answered for me and the process continued. However, it took a very long time: hours and hours, even though it was only a 2MB file.

After that, a similar dialogue appeared, but this time for a 100GB virtual machine image file, reported as having over 17 million multiply-claimed blocks shared with 0 file(s).

That was 2 days ago and it's still running now.

So, back to my original question: how long should this take? Is it a lost cause, and are there any alternative ways to deal with this? What I really don't understand is why the 100GB file is reported as being shared with 0 files, which seems like a contradiction if I understand the meaning of multiply-claimed blocks correctly.

Best Answer

How long it takes depends on disk subsystem performance, the extent of the damage being repaired, and so on.
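To make that dependence concrete, here is a rough back-of-the-envelope sketch. It takes the 17 million block figure from the question and tries a few cloning rates. The rates are purely hypothetical (nothing here was measured on the affected server), and the 4 KiB block size is an assumption that `tune2fs -l` on the device would confirm or correct.

```python
# Rough estimate of how long cloning multiply-claimed blocks might take.
# The rates below are hypothetical placeholders, not measurements.

BLOCK_SIZE = 4096  # bytes; assumed block size, verify with `tune2fs -l`


def clone_time_hours(blocks: int, blocks_per_second: float) -> float:
    """Hours needed to clone `blocks` blocks at a steady rate."""
    return blocks / blocks_per_second / 3600


def data_size_gib(blocks: int, block_size: int = BLOCK_SIZE) -> float:
    """Amount of data covered by `blocks` blocks, in GiB."""
    return blocks * block_size / 2**30


blocks = 17_000_000  # figure reported by fsck in the question

print(f"{blocks} blocks cover about {data_size_gib(blocks):.1f} GiB")
for rate in (50, 500, 5000):  # hypothetical blocks cloned per second
    hours = clone_time_hours(blocks, rate)
    print(f"{rate:>5} blocks/s -> {hours:7.1f} hours")
```

The spread between the slowest and fastest assumed rate is two orders of magnitude, which is why a single expected duration is hard to give without knowing how fast the disk is actually being processed.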

It sounds like there's some decent filesystem corruption. How big is the actual filesystem? You said it's a 100 GB file and later that it's a VM image? Is this a VM server? Or are you talking about VirtualBox?

Personally, if it took over a day and the damage was definitely to one file, I'd restore the file from backup. If there were any indications of continuing issues, I'd reformat and restore from backup, assuming the drive isn't coincidentally failing. I have trust issues with filesystems that start to go bad. If the drive itself isn't failing, the filesystem may have a pervasive problem until it's started fresh.

But that's me.