SSD Reliability – Evaluating the Reliability of SSD Drives


The main advantage of SSD drives is better performance. I am interested in their reliability.

Are SSD drives more reliable than normal hard drives? Some people say they must be because they have no moving parts, but I am concerned that this is a new technology that may not have fully matured yet.

Best Answer

They haven't been around long enough, or in large enough quantities, to have earned a reputation. Flash wear is the big one everyone is concerned about, which is why enterprise SSDs allocate so many blocks to the bad-block store. Anandtech has run several articles about SSDs over the last couple of months, and they go into a lot of detail. From what I've read, stability problems are primarily in the consumer market, where corners are being cut to bring prices down out of orbit. The SSDs you can buy to put in your Fibre Channel arrays are a completely different class than the OCZ drives. There is perhaps a much larger stability divide between consumer-grade and enterprise SSDs than there is between consumer and enterprise SATA drives.

For more information about enterprise SSDs like the Intel X25, Anandtech has several articles. Their introductory article about the X25 practically gushed. On the desktop side, a recent article about the OCZ Vertex went into some detail about how bad the consumer side of the SSD market really was, and linked to another article where the problem was originally identified in the tech media. In short, consumer-grade SSDs were tweaked to provide massive sequential I/O numbers with little regard to actual usage patterns. The OCZ Vertex is a consumer-grade SSD that can approach the Intel for performance, but it requires babying to get there. Again, none of these have been on the market long enough for outright failure rates to really emerge. It has only been in the last, oh, 6-8 months that consumer SSDs have gotten cheap enough for mass adoption.


Update 6/2011

Two years later, we do have some feel for this now. However, how SSDs are used has evolved. They are used in areas where outright performance can't be economically met with disks, so comparing reliability is something of an apples-to-pears comparison. Servers that need only a small amount of storage usually don't also need high performance from it, so rotational magnetic media is still used most of the time.

That said, some comparisons can be drawn. SSDs are typically used in large storage arrays as the highest tier of performance. In this role, I've heard anecdotal reports that SSDs last a much shorter time than the disks in those same arrays: on the order of 10-18 months. This is reflected in the warranties the big storage vendors offer on SSDs.

This may look like "a lot less reliable", but you have to look at it the right way. Modern top-tier SSDs can handle I/O operations per second (IOPS) into the six digits; matching the performance of even one such drive with 15K RPM disks takes well over a hundred spindles. Mid-grade SSDs can do 30-50K IOPS, which is still over a hundred 15K disks. Modern disk I/O systems can't keep up with speeds like this, which is why the big array vendors only allow a few SSDs per array relative to disks; they simply can't eke enough performance out of the entire system to keep those things fed.
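To make the spindle arithmetic concrete, here is a rough sketch. It assumes about 180 IOPS per 15K RPM disk, which is the per-disk figure used further down in this answer; the function name and the 100K example value are mine, purely for illustration.

```python
import math

# Assumption: roughly 180 IOPS per 15K RPM disk (the per-disk figure used
# later in this answer). Real disks vary with workload and queue depth.
HDD_15K_IOPS = 180


def spindles_needed(ssd_iops, hdd_iops=HDD_15K_IOPS):
    """Return how many 15K disks it takes to match one SSD's IOPS."""
    return math.ceil(ssd_iops / hdd_iops)


if __name__ == "__main__":
    # 30K and 50K are the mid-grade range above; 100K stands in for a
    # "six digit" top-tier drive (an illustrative value, not a measurement).
    for ssd_iops in (30_000, 50_000, 100_000):
        print(f"{ssd_iops:>7} IOPS SSD ~ {spindles_needed(ssd_iops)} x 15K disks")
```

Under that assumption, even the low end of the mid-grade range lands well past a hundred spindles, which is the point of the paragraph above.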

So in reality, we're comparing a brace of (for example) 8 mid-grade SSDs against 250 15K drives. Since this is enterprise storage, give them an 80% duty cycle. In the first year, a couple of those 15K drives will definitely fail and need replacement, possibly up to 20. Anecdotally, half of the SSDs will fail. Looked at this way, as failure rate for the performance delivered, SSDs still aren't up to HDs. Looked at from an economic point of view, each SSD is worth 31.25 HDs (250 / 8), and SSDs are markedly cheaper for the performance delivered, so the increased failure rate is more acceptable: replacing them is still probably cheaper in the long run.
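The same comparison as back-of-the-envelope code, using only the anecdotal numbers from this example. Reading "a couple" as 2 failed disks is my assumption, and the function name is hypothetical; none of this is measured data.

```python
def failure_fraction(failed, total):
    """Fraction of a drive population that failed in the first year."""
    return failed / total


# Example fleets from the paragraph above (anecdotal, not measured).
SSD_COUNT, HDD_COUNT = 8, 250
SSD_FAILED = SSD_COUNT // 2            # "half of the SSDs will fail"
HDD_FAILED_LOW, HDD_FAILED_HIGH = 2, 20  # "a couple" (assumed 2) up to 20

# How many 15K disks each SSD stands in for, performance-wise.
hdds_per_ssd = HDD_COUNT / SSD_COUNT

if __name__ == "__main__":
    print(f"Each SSD replaces {hdds_per_ssd} HDs")
    print(f"SSD first-year failure fraction: "
          f"{failure_fraction(SSD_FAILED, SSD_COUNT):.1%}")
    print(f"HD first-year failure fraction:  "
          f"{failure_fraction(HDD_FAILED_LOW, HDD_COUNT):.1%} to "
          f"{failure_fraction(HDD_FAILED_HIGH, HDD_COUNT):.1%}")
    print(f"Replacements in year one: {SSD_FAILED} SSDs vs "
          f"{HDD_FAILED_LOW}-{HDD_FAILED_HIGH} HDs")
```

The per-drive failure fraction is far worse for the SSDs, but the number of units you actually swap in a year is in the same ballpark or lower, which is why the economics can still favor them.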

Looking at it another way, as a direct apples-to-apples comparison where you subject the two devices to identical I/O loads over a period of time, SSDs are more reliable these days. Take a 15K drive and a mid-grade SSD (50K IOPS), give them both a steady diet of 180 I/O ops per second, and it is more likely that the SSD will make it to 5 years without fault than the HD. It's a statistical dance to be sure, but that's where things are going now.

Hard drives still have the edge in drive-unit failure rate per GB of storage provided. However, this is not a market segment in which SSDs are intended to be competitive.