How to deal with bad sectors

hard drive

Bad sectors will eventually occur, but how should I deal with them? If a bad sector occurs, does that mean that the data in that sector is irrecoverably lost, and I should restore it from backup? Is there any way to automate finding out which file belonged to that sector and at which offset, and to automate that recovery? Is there anything I can do on the filesystem level to make my life easier? (ECC?)

Best Answer

You do not deal with bad sectors. Your hardware, server configuration, and internal procedures protect you from their effects.

  • Every modern hard drive anticipates a certain number of bad sectors and internally remaps them to spare sectors. This process is completely transparent to the user/OS until the spare area is used up, at which point you start seeing bad sectors.
    Usually well before you see bad sectors your drive will start crying: SMART or equivalent technology lets the drive report faults to the operating system (which you are of course monitoring for, right?).

  • If you love your data (and who doesn't) then you don't just trust it to one hard drive.
    All your important data is on RAID volumes (hardware or software - makes no difference for the purposes of this discussion).
    RAID gives you two or more redundant hard drives, so that when one disk fails you have the opportunity to replace it without losing any data.

  • Because you know that RAID Is Not A Backup, you also make regular backups (and periodically verify that you can restore them successfully), so that even if you lose enough drives that your RAID array is trashed you can still get your data back.
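
The SMART monitoring mentioned in the first point can be sketched roughly as follows. This is a minimal illustration, not a replacement for a proper monitoring daemon: it assumes the `smartctl` tool from smartmontools is installed, that the drive reports the classic ATA attribute table, and that the three attribute names in `WATCHED` are the ones your drive uses (names vary by vendor, and NVMe drives report health differently). The function names are hypothetical.

```python
import subprocess

# Assumption: these ATA attribute names are the ones the drive reports.
# Vendors differ, so adjust the list for your hardware.
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
)


def parse_sector_counts(smartctl_output):
    """Extract raw values of the watched attributes from `smartctl -A` text."""
    counts = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the raw value is the 10th column.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            try:
                counts[fields[1]] = int(fields[9])
            except ValueError:
                pass  # raw value in an unexpected format; skip it
    return counts


def warnings_for(counts):
    """Return one message per watched attribute with a non-zero raw value."""
    return [f"{name} = {value}" for name, value in sorted(counts.items()) if value > 0]


def check_device(device):
    """Run `smartctl -A` on a device (usually needs root) and return warnings."""
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True,
        text=True,
        check=False,  # smartctl uses non-zero exit codes to signal drive problems
    )
    return warnings_for(parse_sector_counts(result.stdout))
```

Calling `check_device("/dev/sda")` from a scheduled job and alerting on a non-empty result is one way to use it; the device path here is only an example.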


As with all good strategies, this is Defense In Depth:
The hard drives do their best to safeguard your data by handling errors/bad sectors gracefully.
Should the hard drive fail, RAID keeps your data safe until you can fix the hardware problem.
If the RAID fails to protect you, your backups are a final chance to save your data.
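
Because backups are the last layer, it helps to check that a restore actually produces the same data. The sketch below compares a source directory against a directory restored from backup by hashing every file. It assumes the source has not changed since the backup was taken (otherwise differences are expected), and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source_dir, restored_dir):
    """List files under source_dir that are missing or different in restored_dir."""
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file():
            problems.append((str(rel), "missing"))
        elif file_digest(src) != file_digest(restored):
            problems.append((str(rel), "content differs"))
    return problems
```

An empty list from `compare_trees` means every source file was restored with identical content; it does not check for extra files in the restored tree.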

Ideally you use all of these techniques all of the time (at least for important data), but you always have at least one layer of the onion (even laptop hard drives are S.M.A.R.T. these days).
