Linux – LSI RAID controller errors on DB import – How to troubleshoot

Tags: hardware-raid, linux, oracle, rhel5

We're running an import of a database dump on an Oracle system (RHEL 5.9, 2.6.18-348.6.1.el5). The import does not complete, eventually erroring out with:

ORA-15080: synchronous I/O operation to a disk failed
WARNING: failed to write mirror side 1 of virtual extent 248 logical extent 0 of file 280 in group 1 on disk 1 allocation unit 986
Errors in file /u01/app/oracle/diag/rdbms/dbprod/DBPROD/trace/DBPROD_lgwr_24520.trc:
ORA-00345: redo log write error block 509314 count 2023
ORA-00312: online log 1 thread 1: '+DATA/dbprod/redo01.log'
ORA-15081: failed to submit an I/O operation to a disk
ORA-15081: failed to submit an I/O operation to a disk

There are corresponding errors in the ring buffer and /var/log/messages:

Jun 12 18:54:42 db1-test kernel: megasas: build_ld_io  error, sge_count = 51
Jun 12 18:54:42 db1-test kernel: megasas: Err returned from build_and_issue_cmd
Jun 12 18:54:42 db1-test kernel: megasas: build_ld_io  error, sge_count = 51
Jun 12 18:54:42 db1-test kernel: megasas: Err returned from build_and_issue_cmd
Jun 12 18:54:42 db1-test kernel: megasas: build_ld_io  error, sge_count = 51
Jun 12 18:54:42 db1-test kernel: megasas: Err returned from build_and_issue_cmd
Jun 12 18:54:42 db1-test kernel: sd 0:2:1:0: timing out command, waited 360s
Jun 12 18:54:42 db1-test kernel: sd 0:2:1:0: Unhandled error code
Jun 12 18:54:42 db1-test kernel: sd 0:2:1:0: SCSI error: return code = 0x06000000
Jun 12 18:54:42 db1-test kernel: Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT,SUGGEST_OK
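
For anyone reading the return code above: a rough sketch of how the 32-bit SCSI result word splits into byte fields. The layout (status, message, host, driver, from low byte to high) is my assumption based on the 2.6-era kernel headers, and decode_scsi_result is just an illustrative helper name, not an existing tool.

```shell
# Split a 32-bit SCSI result word into its four byte fields.
# Assumed layout: bits 0-7 status, 8-15 message, 16-23 host, 24-31 driver.
decode_scsi_result() {
    local rc=$(( $1 ))
    printf 'status=0x%02x msg=0x%02x host=0x%02x driver=0x%02x\n' \
        $(( rc & 0xff )) $(( (rc >> 8) & 0xff )) \
        $(( (rc >> 16) & 0xff )) $(( (rc >> 24) & 0xff ))
}

decode_scsi_result 0x06000000
```

For 0x06000000 this gives a host byte of 0x00 and a driver byte of 0x06, which lines up with the hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT line the kernel printed.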

The drive array containing the import is a 10-disk SAS array in RAID 1+0 using 300GB 10k disks. The RAID controller is an LSI MegaRAID SAS 9260-8i. No disk or adapter errors are reported via MegaCLI.

Is this a hardware issue?
Is there any way to troubleshoot? The RAID controller status is fine. The disks and logical drives report healthy.
Is this a Linux OS or tuning issue? I'll try with different I/O schedulers to be sure. CFQ is default.

Edit:

Other schedulers have been tried with the same result. There is a third-party (Vormetric) filesystem encryption module running in this setup. Removing it allows the import to complete. So now I'm wondering if this is a deficiency in the module or if it is triggering a bad condition in the LSI driver.

During the import, we're hitting 14,000 write IOPS.

In recent attempts, the system stalls entirely with the following on the console.
[Screenshot: console output during the stall]

Last top output before freeze.
[Screenshot: top output before the freeze]

Best Answer

Ultimately Sergey is right - this is a driver problem. But let's check things out first:

First off, you'll want to use the deadline I/O scheduler rather than CFQ. As its name implies, deadline puts an expiry time on every request, so that no I/O sits in the queue indefinitely.
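
A minimal sketch of checking and switching the scheduler through sysfs. The sdX device name is a placeholder (substitute the block device backing the array), and active_scheduler and SCHED_FILE are names made up for this example.

```shell
# Print the active I/O scheduler from a sysfs scheduler file.
# The kernel marks the active one with square brackets, e.g.
#   noop anticipatory deadline [cfq]
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# sdX is a placeholder; point this at the device behind the logical drive.
SCHED_FILE=${SCHED_FILE:-/sys/block/sdX/queue/scheduler}

# Only attempt the switch if the file exists and is writable (needs root).
if [ -w "$SCHED_FILE" ]; then
    echo "before: $(active_scheduler "$SCHED_FILE")"
    echo deadline > "$SCHED_FILE"    # applies immediately, lost on reboot
    echo "after:  $(active_scheduler "$SCHED_FILE")"
fi
```

The change made this way does not survive a reboot. To make it stick, the usual approach is the elevator=deadline kernel boot parameter.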

Grab the events from the megaraid card:

megacli -adpeventlog -getevents -f /tmp/megaraid-$(date +%F_%T) -aALL

Check the SMART data on the disks (you may need to build a newer smartmontools if the packaged version does not support the megaraid device type):

# megacli -pdlist -a0 |grep 'Device Id'
Device Id: 10
Device Id: 9

# smartctl -a /dev/sda -d megaraid,9
«…»
# smartctl -a /dev/sda -d megaraid,10
«…»
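
With ten disks you probably don't want to type these by hand. Here is a rough sketch that turns the megacli device list into one smartctl command per disk. The helper names device_ids and smart_commands are invented for this example, and it assumes the "Device Id: N" lines look exactly like the output above.

```shell
# Extract the numeric IDs from "Device Id: N" lines on stdin.
device_ids() {
    awk -F': *' '/^Device Id/ { print $2 }'
}

# Print one smartctl command per device ID read from stdin.
# /dev/sda mirrors the example above; adjust it for your system.
smart_commands() {
    device_ids | while read -r id; do
        echo "smartctl -a /dev/sda -d megaraid,$id"
    done
}

# Review the generated commands first, then pipe them to sh to run them:
#   megacli -pdlist -a0 | smart_commands | sh
```

Printing the commands rather than running them directly makes it easy to check the device IDs before anything touches the controller.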

If everything looks OK, go ahead and try out the latest driver from LSI.

"There is a third-party (Vormetric) filesystem encryption module running in this setup. Removing it allows the import to complete. So now I'm wondering if this is a deficiency in the module or if it is triggering a bad condition in the LSI driver."

The Vormetric module is likely doing something incompatible, yes. I would start by talking with them about how their module is screwing up your system under high load.

Related Solutions

Linux – mdadm raid1 fails to resync

You shouldn't attempt to prepare the new drive in any meaningful way unless your RAID members are partitions rather than whole disks. In that case, you would create a partition on the new drive that is the same size as the one on the remaining active disk.

You never need to touch the old drive at all -- it's assumed to be failed and unreliable.

The correct procedure is to remove the broken drive, add a new, empty drive, and then use mdadm to add that new drive to the array. You'd do it something like this:

mdadm --add /dev/md0 /dev/<newdrive>

The kernel will then sync the new drive into the array, copying the data from the one remaining good drive.
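
To keep an eye on the sync, you can read the progress out of /proc/mdstat. A small sketch, assuming the usual "recovery = N%" progress line that md prints while rebuilding onto a new member. The function name recovery_percent is made up for this example.

```shell
# Print the rebuild percentage from mdstat-formatted text on stdin.
# Prints nothing when no recovery line is present.
recovery_percent() {
    sed -n 's/.*recovery *= *\([0-9.]*\)%.*/\1/p'
}

# Typical use on a live system:
#   recovery_percent < /proc/mdstat
```

No output means no recovery line was found, which is what you would expect once the new drive is fully synced.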

Raid1, one disk displaying SMART errors:

Actually, your answer is not completely correct. In all likelihood you do not want to break the RAID.

First, if the system is under warranty, call HP. If the warranty has only recently expired, I would still call and see if they will cut you a break. Make sure to tell them that you are actually using the hardware RAID.

1/ Get a replacement drive. If the system is under warranty, HP should send one. If not, go buy one or order one online. The replacement should be the same model if at all possible. If not, it needs to be the same size or bigger.

2/ If you haven't already, take a backup. At the very least, get a Dropbox or SugarSync account and get a copy of your important stuff off the machine.

3/ If you don't have one, create a Recovery Disk. The specific procedure depends on your OS.

4/ If you haven't already, figure out specifically which drive has failed. The error codes might make it clear, or the raid array management utility might tell you.

5/ I presume it is not a hot-swappable RAID controller. So, turn off the system, swap the failed drive for the new drive, and start the system. Go into whichever tool you used to create the RAID set, and confirm that it sees the new drive and is rebuilding.

I would not reuse the bad drive in any production system. You could run SpinRite on it, possibly resolve the issue, and keep it around as a cold spare, but I wouldn't.
