Failing drive: can it be salvaged?

hard drive

Today all of a sudden I got this in the kernel log (all was fine until that point):

Sep  5 03:19:12 foo kernel: [36337581.601013] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Sep  5 03:19:12 foo kernel: [36337581.601919] ata3.00: failed command: FLUSH CACHE EXT
Sep  5 03:19:12 foo kernel: [36337581.602738] ata3.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 8
Sep  5 03:19:12 foo kernel: [36337581.602738]          res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
Sep  5 03:19:12 foo kernel: [36337581.604367] ata3.00: status: { DRDY }
Sep  5 03:19:12 foo kernel: [36337581.605232] ata3: hard resetting link
Sep  5 03:19:17 foo kernel: [36337586.956878] ata3: link is slow to respond, please be patient (ready=0)
Sep  5 03:19:22 foo kernel: [36337591.636770] ata3: COMRESET failed (errno=-16)
Sep  5 03:19:22 foo kernel: [36337591.637740] ata3: hard resetting link
Sep  5 03:19:28 foo kernel: [36337596.988651] ata3: link is slow to respond, please be patient (ready=0)
Sep  5 03:19:32 foo kernel: [36337601.668524] ata3: COMRESET failed (errno=-16)
Sep  5 03:19:32 foo kernel: [36337601.669508] ata3: hard resetting link
Sep  5 03:19:38 foo kernel: [36337607.024354] ata3: link is slow to respond, please be patient (ready=0)
Sep  5 03:19:49 foo kernel: [36337618.544137] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Sep  5 03:19:50 foo kernel: [36337619.953761] ata3.00: configured for UDMA/133
Sep  5 03:19:50 foo kernel: [36337619.953767] ata3.00: retrying FLUSH 0xea Emask 0x4
Sep  5 03:19:50 foo kernel: [36337619.953990] ata3: EH complete
Sep  5 03:28:14 foo kernel: [36338123.536034] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Sep  5 03:28:14 foo kernel: [36338123.536999] ata3.00: failed command: FLUSH CACHE EXT
Sep  5 03:28:14 foo kernel: [36338123.537934] ata3.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
Sep  5 03:28:14 foo kernel: [36338123.537934]          res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep  5 03:28:14 foo kernel: [36338123.539824] ata3.00: status: { DRDY }
Sep  5 03:28:14 foo kernel: [36338123.540820] ata3: hard resetting link
Sep  5 03:28:19 foo kernel: [36338128.895910] ata3: link is slow to respond, please be patient (ready=0)
Sep  5 03:28:24 foo kernel: [36338133.575787] ata3: COMRESET failed (errno=-16)
Sep  5 03:28:24 foo kernel: [36338133.576797] ata3: hard resetting link
Sep  5 03:28:29 foo kernel: [36338138.927669] ata3: link is slow to respond, please be patient (ready=0)
Sep  5 03:28:34 foo kernel: [36338143.607555] ata3: COMRESET failed (errno=-16)
Sep  5 03:28:34 foo kernel: [36338143.608599] ata3: hard resetting link
Sep  5 03:28:39 foo kernel: [36338148.963407] ata3: link is slow to respond, please be patient (ready=0)
Sep  5 03:29:09 foo kernel: [36338178.610690] ata3: COMRESET failed (errno=-16)
Sep  5 03:29:09 foo kernel: [36338178.611800] ata3: limiting SATA link speed to 1.5 Gbps
Sep  5 03:29:09 foo kernel: [36338178.611803] ata3: hard resetting link
Sep  5 03:29:14 foo kernel: [36338183.662571] ata3: COMRESET failed (errno=-16)
Sep  5 03:29:14 foo kernel: [36338183.663665] ata3: reset failed, giving up
Sep  5 03:29:14 foo kernel: [36338183.664680] ata3.00: disabled
Sep  5 03:29:14 foo kernel: [36338183.664702] ata3: EH complete
Sep  5 03:29:14 foo kernel: [36338183.664746] sd 2:0:0:0: [sdc] tag#1 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep  5 03:29:14 foo kernel: [36338183.664754] sd 2:0:0:0: [sdc] tag#1 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
Sep  5 03:29:14 foo kernel: [36338183.664762] print_req_error: I/O error, dev sdc, sector 1950632856
Sep  5 03:29:14 foo kernel: [36338183.665854] Aborting journal on device sdc1-8.
Sep  5 03:29:14 foo kernel: [36338183.665936] sd 2:0:0:0: [sdc] tag#9 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep  5 03:29:14 foo kernel: [36338183.665942] sd 2:0:0:0: [sdc] tag#9 CDB: Write(10) 2a 00 50 f0 39 b0 00 00 08 00
Sep  5 03:29:14 foo kernel: [36338183.665945] print_req_error: I/O error, dev sdc, sector 1357920688
Sep  5 03:29:14 foo kernel: [36338183.665953] EXT4-fs warning (device sdc1): ext4_end_bio:323: I/O error 10 writing to inode 39321953 (offset 0 size 4096 starting block 169740087)
Sep  5 03:29:14 foo kernel: [36338183.665958] sd 2:0:0:0: [sdc] tag#4 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep  5 03:29:14 foo kernel: [36338183.665961] Buffer I/O error on device sdc1, logical block 169739830
Sep  5 03:29:14 foo kernel: [36338183.665968] sd 2:0:0:0: [sdc] tag#4 CDB: Write(10) 2a 00 b0 68 ca 00 00 00 08 00
Sep  5 03:29:14 foo kernel: [36338183.665971] print_req_error: I/O error, dev sdc, sector 2959657472
Sep  5 03:29:14 foo kernel: [36338183.665980] EXT4-fs warning (device sdc1): ext4_end_bio:323: I/O error 10 writing to inode 39325770 (offset 0 size 0 starting block 369957185)
Sep  5 03:29:14 foo kernel: [36338183.665985] Buffer I/O error on device sdc1, logical block 369956928
Sep  5 03:29:14 foo kernel: [36338183.666021] sd 2:0:0:0: [sdc] tag#3 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep  5 03:29:14 foo kernel: [36338183.666023] sd 2:0:0:0: [sdc] tag#6 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep  5 03:29:14 foo kernel: [36338183.666027] sd 2:0:0:0: [sdc] tag#3 CDB: Write(10) 2a 00 af e4 94 e0 00 00 08 00
Sep  5 03:29:14 foo kernel: [36338183.666028] sd 2:0:0:0: [sdc] tag#6 CDB: Write(10) 2a 00 b6 3d 5b f8 00 00 08 00
Sep  5 03:29:14 foo kernel: [36338183.666032] print_req_error: I/O error, dev sdc, sector 2950993120
Sep  5 03:29:14 foo kernel: [36338183.666034] sd 2:0:0:0: [sdc] tag#2 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep  5 03:29:14 foo kernel: [36338183.666037] print_req_error: I/O error, dev sdc, sector 3057474552
Sep  5 03:29:14 foo kernel: [36338183.666042] EXT4-fs warning (device sdc1): ext4_end_bio:323: I/O error 10 writing to inode 39583803 (offset 0 size 0 starting block 368874141)
Sep  5 03:29:14 foo kernel: [36338183.666044] sd 2:0:0:0: [sdc] tag#2 CDB: Write(10) 2a 00 ae 63 93 90 00 00 08 00
Sep  5 03:29:14 foo kernel: [36338183.666049] EXT4-fs warning (device sdc1): ext4_end_bio:323: I/O error 10 writing to inode 40108080 (offset 0 size 0 starting block 382184320)
Sep  5 03:29:14 foo kernel: [36338183.666051] Buffer I/O error on device sdc1, logical block 368873884
Sep  5 03:29:14 foo kernel: [36338183.666054] print_req_error: I/O error, dev sdc, sector 2925761424
Sep  5 03:29:14 foo kernel: [36338183.666057] Buffer I/O error on device sdc1, logical block 382184063
Sep  5 03:29:14 foo kernel: [36338183.666061] EXT4-fs warning (device sdc1): ext4_end_bio:323: I/O error 10 writing to inode 39845943 (offset 0 size 0 starting block 365720179)
Sep  5 03:29:14 foo kernel: [36338183.666066] Buffer I/O error on device sdc1, logical block 365719922
Sep  5 03:29:14 foo kernel: [36338183.666078] sd 2:0:0:0: [sdc] tag#5 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep  5 03:29:14 foo kernel: [36338183.666079] sd 2:0:0:0: [sdc] tag#8 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep  5 03:29:14 foo kernel: [36338183.666082] sd 2:0:0:0: [sdc] tag#5 CDB: Write(10) 2a 00 b1 38 bb 78 00 00 10 00
Sep  5 03:29:14 foo kernel: [36338183.666084] sd 2:0:0:0: [sdc] tag#8 CDB: Write(10) 2a 00 50 85 8f c8 00 00 08 00
Sep  5 03:29:14 foo kernel: [36338183.666085] print_req_error: I/O error, dev sdc, sector 2973285240
Sep  5 03:29:14 foo kernel: [36338183.666088] print_req_error: I/O error, dev sdc, sector 1350930376
Sep  5 03:29:14 foo kernel: [36338183.666090] sd 2:0:0:0: [sdc] tag#12 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep  5 03:29:14 foo kernel: [36338183.666095] EXT4-fs warning (device sdc1): ext4_end_bio:323: I/O error 10 writing to inode 41287759 (offset 0 size 0 starting block 371660656)
Sep  5 03:29:14 foo kernel: [36338183.666098] EXT4-fs warning (device sdc1): ext4_end_bio:323: I/O error 10 writing to inode 39714837 (offset 0 size 0 starting block 168866298)
Sep  5 03:29:14 foo kernel: [36338183.666101] sd 2:0:0:0: [sdc] tag#12 CDB: Write(10) 2a 00 60 cc 11 58 00 00 10 00
Sep  5 03:29:14 foo kernel: [36338183.666103] Buffer I/O error on device sdc1, logical block 371660399
Sep  5 03:29:14 foo kernel: [36338183.666105] Buffer I/O error on device sdc1, logical block 168866041
Sep  5 03:29:14 foo kernel: [36338183.666107] print_req_error: I/O error, dev sdc, sector 1623986520
Sep  5 03:29:14 foo kernel: [36338183.666112] EXT4-fs warning (device sdc1): ext4_end_bio:323: I/O error 10 writing to inode 41287759 (offset 3866624 size 4096 starting block 371660657)
Sep  5 03:29:14 foo kernel: [36338183.666116] EXT4-fs warning (device sdc1): ext4_end_bio:323: I/O error 10 writing to inode 38535200 (offset 0 size 0 starting block 202998316)
Sep  5 03:29:14 foo kernel: [36338183.666118] Buffer I/O error on device sdc1, logical block 371660400
Sep  5 03:29:14 foo kernel: [36338183.666120] Buffer I/O error on device sdc1, logical block 202998059
Sep  5 03:29:14 foo kernel: [36338183.666122] sd 2:0:0:0: [sdc] tag#10 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep  5 03:29:14 foo kernel: [36338183.666128] EXT4-fs warning (device sdc1): ext4_end_bio:323: I/O error 10 writing to inode 38535200 (offset 442368 size 4096 starting block 202998317)
Sep  5 03:29:14 foo kernel: [36338183.666131] sd 2:0:0:0: [sdc] tag#10 CDB: Write(10) 2a 00 53 a8 6e b0 00 00 08 00
Sep  5 03:29:14 foo kernel: [36338183.666133] print_req_error: I/O error, dev sdc, sector 1350504016
Sep  5 03:29:14 foo kernel: [36338183.666136] Buffer I/O error on device sdc1, logical block 202998060
Sep  5 03:29:14 foo kernel: [36338183.683800] Buffer I/O error on dev sdc1, logical block 243826688, lost sync page write
Sep  5 03:29:14 foo kernel: [36338183.684426] JBD2: Error -5 detected when updating journal superblock for sdc1-8.
Sep  5 03:29:14 foo kernel: [36338183.685103] JBD2: Detected IO errors while flushing file data on sdc1-8
Sep  5 03:29:14 foo kernel: [36338183.686102] Buffer I/O error on dev sdc1, logical block 0, lost sync page write
Sep  5 03:29:14 foo kernel: [36338183.687536] EXT4-fs error (device sdc1): ext4_journal_check_start:61: Detected aborted journal
Sep  5 03:29:14 foo kernel: [36338183.688927] EXT4-fs (sdc1): Remounting filesystem read-only
Sep  5 03:29:14 foo kernel: [36338183.690246] EXT4-fs (sdc1): previous I/O error to superblock detected
Sep  5 03:29:14 foo kernel: [36338183.691754] Buffer I/O error on dev sdc1, logical block 0, lost sync page write
Sep  5 03:30:18 foo kernel: [36338247.094817] scsi_io_completion: 111 callbacks suppressed
Sep  5 03:30:18 foo kernel: [36338247.094824] sd 2:0:0:0: [sdc] tag#30 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep  5 03:30:18 foo kernel: [36338247.094830] sd 2:0:0:0: [sdc] tag#30 CDB: Read(10) 28 00 4b 85 03 58 00 00 08 00
Sep  5 03:30:18 foo kernel: [36338247.094831] print_req_error: 113 callbacks suppressed
Sep  5 03:30:18 foo kernel: [36338247.094833] print_req_error: I/O error, dev sdc, sector 1267008344
Sep  5 03:30:18 foo kernel: [36338247.096047] sd 2:0:0:0: [sdc] tag#0 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep  5 03:30:18 foo kernel: [36338247.096050] sd 2:0:0:0: [sdc] tag#0 CDB: Read(10) 28 00 4b 85 03 58 00 00 08 00
Sep  5 03:30:18 foo kernel: [36338247.096052] print_req_error: I/O error, dev sdc, sector 1267008344
Sep  5 03:30:19 foo kernel: [36338248.410203] sd 2:0:0:0: [sdc] tag#1 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep  5 03:30:19 foo kernel: [36338248.410208] sd 2:0:0:0: [sdc] tag#1 CDB: Read(10) 28 00 4b c4 08 08 00 00 08 00
Sep  5 03:30:19 foo kernel: [36338248.410210] print_req_error: I/O error, dev sdc, sector 1271138312
Sep  5 03:30:19 foo kernel: [36338248.411396] sd 2:0:0:0: [sdc] tag#2 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep  5 03:30:19 foo kernel: [36338248.411401] sd 2:0:0:0: [sdc] tag#2 CDB: Read(10) 28 00 4b c4 08 08 00 00 08 00
Sep  5 03:30:19 foo kernel: [36338248.411404] print_req_error: I/O error, dev sdc, sector 1271138312
Sep  5 03:30:19 foo kernel: [36338248.414145] sd 2:0:0:0: [sdc] tag#3 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep  5 03:30:19 foo kernel: [36338248.414149] sd 2:0:0:0: [sdc] tag#3 CDB: Read(10) 28 00 4e 44 18 b0 00 00 08 00
Sep  5 03:30:19 foo kernel: [36338248.414152] print_req_error: I/O error, dev sdc, sector 1313085616
Sep  5 03:30:19 foo kernel: [36338248.415072] sd 2:0:0:0: [sdc] tag#4 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep  5 03:30:19 foo kernel: [36338248.415077] sd 2:0:0:0: [sdc] tag#4 CDB: Read(10) 28 00 4e 44 18 b0 00 00 08 00

… which kept repeating.

The mount point and some of its subdirectories were still accessible, though not all of them and not to their full depth. Still, that was some sign of life, so … encouraging.

The drive is a Seagate 2TB SSHD 2.5", which I have been thrashing lately with lots of small files. That workload is why I got the SSHD version, thinking the small internal SSD buffer would help. Nevertheless …

I then unmounted /dev/sdc1 to attempt an fsck. The first time I ran fsck, it said:

# fsck /dev/sdc1
fsck from util-linux 2.27.1
e2fsck 1.42.13 (17-May-2015)
fsck.ext2: Attempt to read block from filesystem resulted in short read while trying to open /dev/sdc1
Could this be a zero-length partition?

And then running it again a few seconds later, it said:

# fsck /dev/sdc1
fsck from util-linux 2.27.1
e2fsck 1.42.13 (17-May-2015)
fsck.ext2: No such file or directory while trying to open /dev/sdc1
Possibly non-existent device?

WTF? I then ls'ed /dev

# ls /dev/sd*
/dev/sda  /dev/sda1  /dev/sda2  /dev/sdb  /dev/sdb1  /dev/sdb2  /dev/sdb3  /dev/sdc  /dev/sdd

/dev/sdc1 is gone entirely …

/dev/sdd is another drive of the same kind, a Seagate 2TB SSHD. I tried smartctl on both:

# smartctl -A /dev/sdc
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.15.0-54-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

Short INQUIRY response, skip product id
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

.. ok, adding permissive …

# smartctl -T permissive -A /dev/sdc
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.15.0-54-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

Short INQUIRY response, skip product id
=== START OF READ SMART DATA SECTION ===
Read defect list: asked for grown list but didn't get it

wtf?

Running it on /dev/sdd (the second drive of the same kind):

# smartctl -A /dev/sdd
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.15.0-54-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   006    Pre-fail  Always       -       360320
  3 Spin_Up_Time            0x0003   099   099   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       15
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   100   253   045    Pre-fail  Always       -       9890
  9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       21111 (206 45 0)
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       41
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   066   043   040    Old_age   Always       -       34 (Min/Max 28/41)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       37
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       15
194 Temperature_Celsius     0x0022   034   057   000    Old_age   Always       -       34 (0 20 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       21111 (165 231 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       0
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       360320
254 Free_Fall_Sensor        0x0032   100   100   000    Old_age   Always       -       0

This one is fine.

Is my /dev/sdc dead in the water or is there something I can try to get my data back (considering that some files and folders were still accessible before I unmounted)?

It had a single ext4 primary partition spanning the whole drive (but I can't recall the starting sector or exactly how I created it).
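On the starting sector: the kernel log above may already contain enough to work it out. What follows is only a sketch, and it assumes 512-byte sectors and 4096-byte ext4 blocks, neither of which is confirmed anywhere above. Bytes 2 to 5 of a 10-byte SCSI CDB hold the absolute LBA on sdc, while the "Buffer I/O error ... logical block" number is counted in filesystem blocks from the start of sdc1, so the difference between the two should be the partition offset. The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: assumes 512-byte sectors and 4096-byte filesystem blocks,
# i.e. 8 sectors per block. Function names are hypothetical.

# Decode the LBA from a Read(10)/Write(10) CDB passed as ten hex bytes.
# Bytes 2-5 (counting from 0) hold the big-endian LBA, i.e. $3..$6 here.
cdb10_lba() {
    echo $(( 0x$3$4$5$6 ))
}

# Partition start = absolute sector on the disk
#                 - (logical block inside the partition * sectors per block).
partition_start() {
    echo $(( $1 - $2 * 8 ))
}

# Values copied from the log: the Write(10) CDB for tag#9 and the
# "Buffer I/O error on device sdc1, logical block ..." line that follows it.
sector=$(cdb10_lba 2a 00 50 f0 39 b0 00 00 08 00)
echo "sector:          $sector"
echo "partition start: $(partition_start "$sector" 169739830)"
```

The decoded sector agrees with the print_req_error line (1357920688), and the other sector / logical block pairs in the log give the same offset, so under these assumptions the partition most likely started at sector 2048.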

Cheers

Best Answer

Before doing anything that might condemn your disk, first try the following:

  • shut the machine down completely, then unplug and replug the drive

  • try another SATA cable / power plug

  • try another SATA port on the motherboard

  • if you can, try to read it from another computer

At boot, press F2 (...) to enter the setup and check whether the disk is visible from the BIOS. You can also take that opportunity to check the BIOS settings for that port.
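Once the system is back up, the same visibility check can be made from Linux. This is a minimal sketch, assuming the drive still comes up as sdc (the name may change if you move it to another port). The function name and its optional second argument are made up for illustration; it only reads sysfs and writes nothing to the disk.

```shell
#!/bin/sh
# Sketch only: report whether the kernel currently sees a disk and its
# partitions. Function name and the optional sysfs-root argument are
# hypothetical; nothing here touches the disk itself.
disk_status() {
    dev=$1
    root=${2:-/sys/block}
    if [ ! -d "$root/$dev" ]; then
        echo "$dev: not detected"
        return 1
    fi
    # sysfs reports "size" in units of 512-byte sectors.
    echo "$dev: detected, $(cat "$root/$dev/size") sectors"
    # Partitions appear as sub-directories named after the disk (sdc1, ...).
    for part in "$root/$dev/$dev"[0-9]*; do
        [ -d "$part" ] && echo "  partition: $(basename "$part")"
    done
    return 0
}

disk_status sdc || echo "check cabling, port and BIOS settings again"
```

If the disk is listed but no partition line appears, that matches the situation in the question, where /dev/sdc existed but /dev/sdc1 had vanished.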

Also check the Seagate site to see whether anyone has had a similar problem with that disk.

There is also a utility from Seagate that you might run. Since the disk is an SSHD, you might find a way to restore some of the data stored on either the hard-disk or the SSD part of the drive.