Why do sequential writes have better performance than random writes on SSDs

benchmark, hard-drive, io, performance, ssd

LBAs (logical block addresses) are translated to physical pages/blocks by a mapping table implemented in the FTL (flash translation layer). My guess is that most SSDs, at least when they are empty, keep the physical addresses in the same order as the logical ones (physical address 0 is mapped to logical address 0, 1 to 1, and so on).

When a page is changed, the SSD controller copies the page to its cache, applies the change, marks the old page as invalid/stale, writes the updated page to a different location, and updates the mapping table.

So, after a couple of writes, even if the physical addresses started out aligned with the logical ones, that order gets scrambled!
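A toy sketch of this out-of-place update behavior (a hypothetical page-level mapping table of my own; real controllers are far more sophisticated):

```python
import random

# Toy model of the FTL behavior described above -- a hypothetical sketch,
# not any vendor's actual controller logic.
NUM_PAGES = 16

l2p = {lpn: lpn for lpn in range(NUM_PAGES)}  # logical -> physical, identity at first
stale = set()                                 # physical pages holding invalid data
next_free = NUM_PAGES                         # next never-written physical page

def overwrite(lpn):
    """Out-of-place update: invalidate the old physical page and remap."""
    global next_free
    stale.add(l2p[lpn])   # old physical location becomes invalid/stale
    l2p[lpn] = next_free  # new data lands somewhere else entirely
    next_free += 1

random.seed(0)
for _ in range(6):
    overwrite(random.randrange(NUM_PAGES))

print(l2p)  # the identity ordering of the empty drive is gone
```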

Why do sequential writes have better performance than random writes, then?

Edit

The performance gap between sequential and random writes was present regardless of the block size or the queue depth.

Best Answer

A reasonably concise explanation by Seagate of how garbage collection is responsible for the difference in SSD performance for random versus sequential writes:

... the need for garbage collection affects an SSD’s performance, because any write operation to a “full” disk (one whose initial free space or capacity has been filled at least once) needs to await the availability of new free space created through the garbage collection process. Because garbage collection occurs at the block level, there is also a significant performance difference, depending on whether sequential or random data is involved. Sequential files fill entire blocks, which dramatically simplifies garbage collection. The situation is very different for random data.

As random data is written, often by multiple applications, the pages are written sequentially throughout the blocks of the flash memory.
The problem is: This new data is replacing old data distributed randomly in other blocks. This causes a potentially large number of small “holes” of invalid pages to become scattered among the pages still containing valid data. During garbage collection of these blocks, all valid data must be moved (i.e. read and re-written) to a different block.
By contrast, when sequential files are replaced, entire blocks are often invalid, so no data needs to be moved. Sometimes a portion of a sequential file might share a block with another file, but on average only about half of such blocks will need to be moved, making it much faster than garbage collection for randomly-written blocks. ...
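To make the quoted argument concrete, here is a rough back-of-the-envelope simulation (the block and page counts are my own assumptions, not Seagate's figures): it fills a drive once, overwrites a quarter of it either sequentially or at random, and counts how many valid pages garbage collection would have to relocate in order to erase every block that contains stale pages.

```python
import random

# Back-of-the-envelope simulation of the effect described in the quote.
# The drive is "full": logical page i sits in physical block
# i // PAGES_PER_BLOCK. We overwrite the same number of pages either
# sequentially or at random, then count the still-valid pages that GC
# must copy out of every block containing stale pages.
PAGES_PER_BLOCK = 64
NUM_BLOCKS = 256
TOTAL_PAGES = PAGES_PER_BLOCK * NUM_BLOCKS

def gc_copy_cost(overwritten_pages):
    """Valid pages that must be relocated to erase every dirtied block."""
    stale_per_block = {}
    for lpn in overwritten_pages:
        blk = lpn // PAGES_PER_BLOCK
        stale_per_block[blk] = stale_per_block.get(blk, 0) + 1
    # Each dirtied block is erased; its remaining valid pages get copied.
    return sum(PAGES_PER_BLOCK - n for n in stale_per_block.values())

random.seed(1)
n = TOTAL_PAGES // 4  # overwrite a quarter of the drive either way

sequential = range(n)                              # one contiguous run
scattered = random.sample(range(TOTAL_PAGES), n)   # spread across the drive

print("sequential GC copies:", gc_copy_cost(sequential))
print("random GC copies:    ", gc_copy_cost(scattered))
```

The sequential run invalidates whole blocks, so the copy cost is zero; the scattered run leaves roughly three-quarters of each dirtied block still valid, and all of that data has to be rewritten before the block can be erased.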