Mounting the filesystem with sync specified in fstab would probably help. I suspect someone will have a recommendation better suited for your particular application.
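For example, a hypothetical fstab entry (device name and mount point are placeholders) would simply add sync to the mount options:

```
# /etc/fstab — illustrative entry; sync makes all writes synchronous,
# trading throughput for durability on power loss
/dev/sda1  /data  ext4  defaults,sync  0  2
```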
I began initial research on filesystems used with flash storage, as I want to custom-build a home theater PC as an appliance. You may find a different storage solution better suited for your device. Unfortunately, I have yet to find something I prefer, so I do not have a detailed recommendation there.
Edit 1
According to the smb.conf(5) manpage, Samba supports immediate syncing:
strict sync (S)
    Many Windows applications (including the Windows 98 explorer shell) seem to confuse flushing buffer contents to disk with doing a sync to disk. Under UNIX, a sync call forces the process to be suspended until the kernel has ensured that all outstanding data in kernel disk buffers has been safely stored onto stable storage. This is very slow and should only be done rarely. Setting this parameter to no (the default) means that smbd(8) ignores the Windows applications requests for a sync call. There is only a possibility of losing data if the operating system itself that Samba is running on crashes, so there is little danger in this default setting. In addition, this fixes many performance problems that people have reported with the new Windows 98 explorer shell file copies.

    Default: strict sync = no
sync always (S)
    This is a boolean parameter that controls whether writes will always be written to stable storage before the write call returns. If this is no then the server will be guided by the client's request in each write call (clients can set a bit indicating that a particular write should be synchronous). If this is yes then every write will be followed by a fsync() call to ensure the data is written to disk. Note that the strict sync parameter must be set to yes in order for this parameter to have any effect.

    Default: sync always = no
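Putting the two options together, a share that forces synchronous writes might look like this (the share name and path are hypothetical):

```
# smb.conf — illustrative share section; both options are needed,
# since sync always has no effect unless strict sync = yes
[appliance]
    path = /srv/share
    strict sync = yes
    sync always = yes
```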
On sudden power loss, MLC/TLC/QLC SSDs have two failure modes:
- they lose the in-flight and in-DRAM-only writes;
- they can corrupt any data-at-rest stored in the lower page of the NAND cell being programmed.
The first failure condition is obvious: without power protection, any data which are not on stable storage (ie: the NAND itself) but in volatile cache only (DRAM) will be lost. The same happens with classical mechanical disks, and that alone can wreak havoc on a filesystem which does not properly issue fsyncs.
The second failure condition is an MLC+ SSD affair: when reprogramming the high page bit to store new data, an unexpected power loss can also destroy/alter the lower bit (ie: previously committed data).
The only true, and most obvious, solution is to integrate a power-loss-protected DRAM cache (generally backed by a battery or supercaps), as done since forever by high-end RAID controllers; this, however, increases drive cost/price. Consumer drives typically have no power-loss-protected caches; rather, they use an array of more economical solutions, such as:
- partially protected write cache (ie: Crucial M500/M550/M600+);
- NAND changes journal (ie: Samsung drives, see SMART PoR attribute);
- special SLC/pseudo-SLC NAND regions to absorb new writes without putting previous data at risk (ie: Sandisk, Samsung, etc).
Back to your question: your Kingston drives are ultra-cheap ones, with an unspecified controller and basically no public specs. It does not surprise me that a sudden power loss corrupted previous data. Unfortunately, even disabling the disk's DRAM cache (with the massive performance loss that entails) will not solve your problem, as previous data (ie: data-at-rest) can, and will, be corrupted by unexpected power losses. If they are based on the old SandForce controller, even a total drive brick can be expected under the "right" circumstances.
I strongly suggest reviewing your UPS and, in the mid-term, replacing these aging drives.
A last note about PostgreSQL and other Linux databases: they will not disable the disk's cache, and they should not be expected to do that. Rather, they issue periodic/required fsyncs/FUAs to commit key data to stable storage. This is the way things should be done unless a very compelling reason exists (ie: a drive which lies about ATA FLUSH/FUA commands).
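The write-then-fsync pattern these databases rely on can be sketched in a few lines of Python (the file name is purely illustrative):

```python
import os

# Minimal sketch of the durability pattern databases use: write the
# record, flush the userspace buffer, then fsync so the kernel commits
# the data to stable storage before we consider it "written".
path = "journal.dat"
with open(path, "wb") as f:
    f.write(b"important record\n")
    f.flush()              # push Python's internal buffer to the kernel
    os.fsync(f.fileno())   # ask the kernel to flush to stable storage
```

Note that fsync only commits this one file; the disk's own cache behavior (and honesty about FLUSH/FUA) is still in play below this layer.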
EDIT: if possible, consider migrating to a checksumming filesystem such as ZFS or BTRFS. At the very least consider XFS, which has journal checksums and, lately, even metadata checksums. If you are forced to use EXT4, consider enabling auto-fsck at startup (fsck.ext4 is very good at repairing corruption).
Best Answer
Potentially, yes. There are two obvious routes via which this could happen.
Ext4 is a metadata journaling filesystem - it only journals the changes to the file's metadata (size, location, dates) - not the file contents (btrfs and zfs do full-data journalling at a big performance cost). So although you should never have to fsck the disk, it doesn't follow that every write operation between opening the file and closing + flushing the buffers will have completed. There is no transactional control over writes to the file data.
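For what it's worth, ext4 can be asked to journal file contents as well via the data=journal mount option; a hypothetical fstab entry (device and mount point are placeholders):

```
# /etc/fstab — illustrative; data=journal journals file data too,
# at a significant performance cost (the default is data=ordered)
/dev/sda2  /srv/data  ext4  defaults,data=journal  0  2
```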
A second possibility is that the disk may be physically damaged by power spikes. Although the rest of the hardware tends to do a good job of isolating the hard disk, there will still be some leakage.
That's a very different question - this is a lot less likely. Certainly the first scenario only applies if you happen to be writing the kernel, bootloader, ramdisk etc at the time of the outage.
See also this Q&A on unix.stackexchange