Linux – kernel: journal commit I/O error

ext3 linux raid

I am having some problems with a Dell 1950 server. I am installing RHEL 4.6 on it, along with Oracle and some other software.

At random times I get the error message "kernel: journal commit I/O error" in my SSH session, and on the monitor attached to the server I see an error scrolling by that says "EXT3-fs error (device sda5) in start_transaction: Journal has aborted."

It has happened several times, but never at the same point during the install. In fact, the last time the system was already up and running and I was just trying to import a database into Oracle.

This has happened with several different hard drives, so I'm pretty sure the drives are not the problem. That makes me think the RAID controller is going bad.

What do you guys think?

** UPDATE **

Pretty sure it was a bad hard drive. I threw another drive in the server and it's been running for about 48 hours without problems.

Best Answer

I've seen those errors before, but not during the install process.

It means the drive returned enough errors that the OS switched the filesystem to read-only mode. If you can find the full logs, you'll probably see some I/O errors that were retried and succeeded before the full-on failure errors you saw, and those earlier entries should mention actual block numbers.
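If it helps with digging through the logs, here is a rough Python sketch that pulls out I/O error lines naming a device and a block or sector, and notes where the first journal abort shows up. The two I/O error patterns are assumptions based on common 2.6-era kernel wording ("end_request: I/O error, dev ..., sector ..." and "Buffer I/O error on device ..., logical block ..."), so adjust them to whatever your kernel actually prints. The sample lines are made up for illustration and are not from the poster's system; `find_io_errors` is just a name I picked.

```python
import re

# Assumed message formats; the exact wording varies between kernel versions.
IO_ERROR_PATTERNS = [
    re.compile(r"I/O error, dev (?P<dev>\w+), sector (?P<block>\d+)"),
    re.compile(r"Buffer I/O error on device (?P<dev>\w+), logical block (?P<block>\d+)"),
]
# The two messages quoted in the question.
JOURNAL_ABORT = re.compile(r"journal commit I/O error|Journal has aborted")


def find_io_errors(lines):
    """Scan log lines; return (io_errors, first_abort).

    io_errors is a list of (line_number, device, block_or_sector) tuples.
    first_abort is the 1-based line number of the first journal abort
    message, or None if there is none.
    """
    io_errors = []
    first_abort = None
    for line_no, line in enumerate(lines, start=1):
        for pattern in IO_ERROR_PATTERNS:
            match = pattern.search(line)
            if match:
                io_errors.append(
                    (line_no, match.group("dev"), int(match.group("block")))
                )
                break
        if first_abort is None and JOURNAL_ABORT.search(line):
            first_abort = line_no
    return io_errors, first_abort


# Made-up sample lines, only to show the shape of the input.
sample = [
    "kernel: end_request: I/O error, dev sda, sector 1234567",
    "kernel: Buffer I/O error on device sda5, logical block 4321",
    "kernel: journal commit I/O error",
    "kernel: EXT3-fs error (device sda5) in start_transaction: Journal has aborted",
]

errors, abort = find_io_errors(sample)
for line_no, dev, block in errors:
    print(f"line {line_no}: I/O error on {dev} at block/sector {block}")
print(f"first journal abort at line: {abort}")
```

To run it against a real log, pass an open file instead of `sample`, for example `find_io_errors(open("/var/log/messages", errors="replace"))`. If the same blocks keep showing up on one drive, that points at the drive; errors scattered across drives point more toward the controller, cabling, or backplane.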

It's a storage system error. It's definitely one of the following: the RAID card, the drives in the RAID array, the cables from the card to the drives, the backplane the drives connect to, the slot the RAID card is plugged into, the power supply for the hard drives, or something else between the CPU and the actual storage blocks.