Electronic – Why erasing SSDs in increments of entire blocks does not damage adjacent cells

flash ssd

I know that to avoid corruption and damage, SSDs are only erased in increments of entire blocks. But how does that not damage adjacent cells? Why isn't energy-intensive tunneling inhibition necessary when you're whacking a whole bunch of cells at the same time?

Best Answer

It's not to avoid corruption or damage. Erasing in blocks is part of the definition of flash memory. Flash memory is descended from EEPROMs -- EEPROM + block erase = flash. There are a few reasons why this is good:

The mechanism used for erase (Fowler-Nordheim tunneling) is slow. In NOR flash, it's possible to use much faster mechanisms for programming like hot electron injection. We're talking microseconds vs. hundreds of milliseconds, a factor of ~100,000. In NAND flash, F-N tunneling is used for word programming as well, but at least one operation is faster.
Conceptually, erasing is more likely to be a block operation anyway. This is true for microcontrollers (where "re-flashing" means a full erase/reprogram), as well as SSDs (you can append data to a file one byte at a time, but deleting the file removes all of the data at once). Overwriting only a little data in an existing file is much less common.
Erase uses high voltages, so erasing in blocks takes less die area than implementing bit- or word-selective high-voltage switching. The same is true for programming, but programming has to be bit-selective, so there's no way around it there.
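
To put rough numbers on the first point, here is a back-of-the-envelope sketch. The timings are the order-of-magnitude figures quoted above, the block size is a made-up value, and the model assumes a single erase pulse takes about the same time no matter how many cells it covers.

```python
# Illustrative numbers only; real parts vary by technology and vendor.
PROGRAM_TIME_S = 1e-6        # assumed: ~1 microsecond per word (hot electron injection, NOR)
ERASE_TIME_S = 100e-3        # assumed: ~100 milliseconds per F-N tunneling erase pulse
WORDS_PER_BLOCK = 32 * 1024  # hypothetical block size, in words

# Ratio between one erase pulse and one word program.
speed_ratio = ERASE_TIME_S / PROGRAM_TIME_S

# If erase were word-selective, every word would still need its own slow pulse.
word_by_word_erase_s = WORDS_PER_BLOCK * ERASE_TIME_S

# With block erase, one pulse clears every word in the block at once.
block_erase_s = ERASE_TIME_S

print(f"erase/program ratio: {speed_ratio:,.0f}x")
print(f"word-by-word erase of one block: {word_by_word_erase_s:,.1f} s")
print(f"block erase: {block_erase_s:.1f} s")
```

Under these assumptions, the slow operation is paid once per block instead of once per word, which is the whole appeal.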

If you post a source, I can comment more on issues of corruption and damage. F-N tunneling does cause long-term oxide damage, which slows down erasing (and programming in NAND flash). It's also possible for a program or erase operation to corrupt ("disturb") nearby bits. Finally, in stacked-gate flash transistors, it's possible for over-erasure to cause corrupt reads, but I don't know how common that is in NAND flash.

EDIT: The Ars Technica article you linked to contains the following paragraph:

While SSDs can read and write to individual pages, they cannot overwrite pages. A freshly erased, blank page of NAND flash has no charges stored in any of its floating gates; it stores all 1s. 1s can be turned into 0s at the page level, but it's a one-way process (turning 0s back into 1s is a potentially dangerous operation because it uses high voltages). It's difficult to confine the effect only to the cells that need to be altered; the high voltages can cause changes to adjacent cells. This can be prevented with tunneling inhibition—you apply a very large amount of voltage to all the surrounding cells so that their electrons don't tunnel away along with the targeted cells—but this results in no small amount of stress on the cells being erased. Consequently, in order to avoid corruption and damage, SSDs are only erased in increments of entire blocks, since energy-intensive tunneling inhibition isn't necessary when you're whacking a whole bunch of cells at the same time. (There's a Mafia joke in here somewhere, I'm sure of it.)
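
The logical behavior described in that paragraph (pages can be programmed from 1 to 0, but only a whole-block erase brings 1s back) can be modeled with a toy class. `FlashBlock` and its method names are hypothetical, and this only models the bit-level semantics, not any real controller or the underlying physics.

```python
class FlashBlock:
    """Toy model of one NAND block: a few pages, erased state is all 1s."""

    def __init__(self, pages=4, page_bytes=2):
        self.pages = [bytearray([0xFF] * page_bytes) for _ in range(pages)]

    def program_page(self, index, data):
        # Programming can only clear bits (1 -> 0), so the result is old AND new.
        page = self.pages[index]
        for i, byte in enumerate(data):
            page[i] &= byte

    def erase(self):
        # Erase has no page granularity: every page in the block returns to all 1s.
        for page in self.pages:
            for i in range(len(page)):
                page[i] = 0xFF


block = FlashBlock()
block.program_page(0, b"\xF0\x0F")
block.program_page(0, b"\xFF\xFF")  # trying to "overwrite" with 1s changes nothing
first = bytes(block.pages[0])

block.program_page(1, b"\x00\x00")
block.erase()                       # both programmed pages are wiped together
after = [bytes(p) for p in block.pages]
```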

I've only worked on NOR flash, so I can't say for sure whether there's something special about NAND flash that causes erase to be harmful. This presentation from Micron suggests that erase is done in blocks because the erase voltage is applied to the P-well, and having a separate P-well for each transistor string would take up too much space. As you can see in the presentation, inhibition is used for programming, and has the same problems that are mentioned in the Ars Technica article. The erase and program voltages are similar, as one would expect. I suspect the Ars writer got confused about the significance of these points:

Due to the design of the flash array, voltages can only be applied to entire rows (gates), strings (channels), or blocks (P-wells) of transistors.
This includes the high voltages used to induce F-N tunneling for program/erase.
If you don't want to program every bit in the selected row, you need to apply an inhibit voltage to some of them.
One component of the inhibit voltage is a medium-high voltage applied to the gates of every unselected row (wordline).
In strings containing bits that you're programming, this medium-high voltage acts as a weak programming voltage for the unselected bits.
Over time, this weak programming can gradually flip a bit that's supposed to be erased.
If you had selective erase, you'd have to do a similar sort of inhibition to protect the unselected bits.
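
The chain of points above can be sketched as a small simulation. Everything here is a simplification I made up for illustration: `NandBlock`, the `DISTURB_LIMIT` threshold, and the assumption that inhibited strings see no stress at all. Real cells tolerate far more pulses and the disturb mechanisms are more varied.

```python
DISTURB_LIMIT = 3  # hypothetical: weak-program pulses an erased cell tolerates


class NandBlock:
    """Toy array: rows are wordlines (gates), columns are strings (channels)."""

    def __init__(self, rows=4, strings=2):
        self.bits = [[1] * strings for _ in range(rows)]    # erased = 1
        self.stress = [[0] * strings for _ in range(rows)]  # accumulated weak programming

    def program_row(self, row, data):
        """data[s] == 0 programs string s in this row; data[s] == 1 inhibits it."""
        for s, want in enumerate(data):
            if want == 1:
                continue  # inhibited string: modeled as untouched (simplification)
            self.bits[row][s] = 0  # full programming voltage on the selected cell
            # Unselected rows in this string get the medium-high gate voltage,
            # which acts as a weak programming pulse on cells that are still erased.
            for r in range(len(self.bits)):
                if r != row and self.bits[r][s] == 1:
                    self.stress[r][s] += 1
                    if self.stress[r][s] >= DISTURB_LIMIT:
                        self.bits[r][s] = 0  # erased bit has drifted to programmed

    def erase(self):
        # The erase voltage goes to the shared P-well, so the whole block resets.
        for r in range(len(self.bits)):
            for s in range(len(self.bits[r])):
                self.bits[r][s] = 1
                self.stress[r][s] = 0


block = NandBlock(rows=4, strings=2)
for row in range(3):
    block.program_row(row, [0, 1])  # program string 0, inhibit string 1

disturbed_row = list(block.bits[3])  # row 3 was never selected for programming
block.erase()
```

In this run, row 3 is never programmed, yet its bit in string 0 flips after three pulses on its neighbors, while string 1 (inhibited every time) stays erased. Selective erase would need the mirror image of this bookkeeping for the cells you want to keep.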

I doubt that erase inhibition would be any more harmful than program inhibition, and they'd probably balance each other out anyway. But selective erase with inhibition would take up more die space, and thus drive up the price. Given that flash memory was around long before NAND SSDs, any quality improvement from block erase would be a happy accident, not a modern design choice.

Again, I've only worked on NOR flash, so if there's anyone out there with NAND experience, please feel free to chime in.
