Linux – Major issues with fsck of 10TB ext3 RAID 6 (memory allocation failed, etc.)

ext3, fsck, linux, memory, software-raid

I recently added a seventh 2TB drive to a Linux md software RAID 6 setup. After md finished reshaping the array from 6 to 7 drives (from 8TB to 10TB usable), I was still able to mount the file system without problems. In preparation for resize2fs, I then unmounted the partition, ran fsck -Cfyv, and was greeted with an endless stream of millions of seemingly random errors. Here is a short excerpt:

Pass 1: Checking inodes, blocks, and sizes
Inode 4193823 is too big.  Truncate? yes
Block #1 (748971705) causes symlink to be too big.  CLEARED.
Block #2 (1076864997) causes symlink to be too big.  CLEARED.
Block #3 (172764063) causes symlink to be too big.  CLEARED.
...
Inode 4271831 has a extra size (39949) which is invalid Fix? yes
Inode 4271831 is in use, but has dtime set.  Fix? yes
Inode 4271831 has imagic flag set.  Clear? yes
Inode 4271831 has a extra size (8723) which is invalid Fix? yes
Inode 4271831 has EXTENTS_FL flag set on filesystem without extents support. Clear? yes
...
Inode 4427371 has compression flag set on filesystem without compression support. Clear? yes
Inode 4427371 has a bad extended attribute block 1242363527.  Clear? yes
Inode 4427371 has INDEX_FL flag set but is not a directory. Clear HTree index? yes
Inode 4427371, i_size is 7582975773853056983, should be 0.  Fix? yes
...
Inode 4556567, i_blocks is 5120, should be 5184.  Fix? yes
Inode 4566900, i_blocks is 5160, should be 5200.  Fix? yes
...
Inode 5628285 has illegal block(s).  Clear? yes
Illegal block #0 (4216391480) in inode 5628285.  CLEARED.
Illegal block #1 (2738385218) in inode 5628285.  CLEARED.
Illegal block #2 (2576491528) in inode 5628285.  CLEARED.
...
Illegal indirect block (2281966716) in inode 5628285.  CLEARED.
Illegal double indirect block (2578476333) in inode 5628285.  CLEARED.
Illegal block #477119515 (3531691799) in inode 5628285.  CLEARED.

Compression? Extents? I've never had ext4 anywhere near this machine!
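For reference, the sequence of steps leading up to this point was roughly the following (device names and mount points are examples, not my literal setup):

    mdadm --add /dev/md0 /dev/sdh1            # add the new 7th disk as a spare
    mdadm --grow /dev/md0 --raid-devices=7    # reshape the RAID 6 from 6 to 7 drives
    cat /proc/mdstat                          # wait until the reshape has finished
    umount /srv/raid
    fsck -Cfyv /dev/md0                       # intended as the pre-resize2fs check

resize2fs was supposed to come after a clean fsck; I never got that far.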

Now, the problem is that fsck keeps dying with the following error message:

Error storing directory block information (inode=5628285, block=0, num=316775570): Memory allocation failed

At first I was able to simply re-run fsck and it would die at a different inode, but now it's settled on 5628285 and I can't get it to go beyond that.

I've spent the last few days searching for fixes to this and found the following three "solutions":

  • Use 64-bit Linux. /proc/cpuinfo lists lm among the processor flags, getconf LONG_BIT returns 64, and uname -a has this to say:
    Linux <servername> 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64 GNU/Linux. Should be all good, no?
  • Add a [scratch_files] section with directory = /var/cache/e2fsck to /etc/e2fsck.conf (the exact stanza is shown after this list). Did that, and every time I re-run fsck it adds another 500K *-dirinfo-* file and an 8M *-icount-* file to /var/cache/e2fsck, so that seems to have its desired effect as well.
  • Add more memory or swap space to the machine. 12GB of RAM and a 32GB swap partition should be sufficient, no?
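For reference, the scratch_files stanza in /etc/e2fsck.conf (documented in e2fsck.conf(5)) looks like this:

    [scratch_files]
    directory = /var/cache/e2fsck

With that in place, e2fsck keeps its large dirinfo and icount data structures in files under that directory instead of in memory, which is exactly what those *-dirinfo-* and *-icount-* files are.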

Needless to say: Nothing helped, otherwise I wouldn't be writing here.

Naturally, the file system is now marked as containing errors and I can't mount it any more. So, as of right now, I've lost 8TB of data to a disk check?!

This leaves me with 3 questions:

  • Is there anything I can do to fix this file system (remember, everything was fine before I ran fsck!) other than spending a month learning the ext3 disk format and then trying to fix it manually with a hex editor?
  • How is it possible that something as mission-critical as fsck, for a file system as popular as ext3, still has issues like this? Especially since ext3 is over a decade old.
  • Is there an alternative to ext3 that doesn't have these sorts of fundamental reliability issues? Maybe jfs?

(I'm using e2fsck 1.42.5 on 64-bit Debian Wheezy 7.1 now, but had the same issues with an earlier version on 32-bit Debian Squeeze)

Best Answer

Just rebuild the array and restore the data from a backup. The whole point of RAID is to minimize downtime. By messing around and trying to fix a problem like this, you only increase your downtime, defeating the whole purpose of RAID. RAID doesn't protect against data loss; it protects against downtime.
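In practice, that path looks something like this (a minimal sketch; the device name, file system choice, and backup source are examples to adapt to your setup):

    mdadm --detail /dev/md0                 # confirm the array itself is healthy
    mkfs.ext3 /dev/md0                      # recreate the file system from scratch
    mount /dev/md0 /srv/raid
    rsync -aHAX /backup/raid/ /srv/raid/    # restore from the most recent backup

Hand-repairing 10TB of corrupted metadata will almost always take longer than a full restore, which is the downtime argument above.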