Why do IOPS matter?

Tags: amazon-web-services, iops, performance, storage

I understand what IOPS and throughput are. Throughput measures data flow in MB/s, and IOPS counts how many I/O operations happen per second.

What I don't understand is why many storage services just show the IOPS they provide. I really can't see any scenario where I would prefer to know the IOPS instead of the throughput.

Why do IOPS matter? Why does AWS mainly show its storage provisions in IOPS? Where are IOPS more relevant than throughput (MB/s)?


EDIT:

Some people are treating this question as if I had asked what random access is and how it affects performance, or how HDDs and SSDs work. That information is useful for people new to storage behavior, but it is getting a lot of focus and it isn't the goal of the question. The question is: "What new piece of information do I get when I see an IOPS number that I wouldn't get from a throughput (MB/s) number?"

Best Answer

Throughput

Throughput is the useful figure when you're moving large amounts of data sequentially, for example copying files. When you're doing almost anything else, it's random reads and writes across the disk that will limit you.

IOPS

IOPS figures are typically quoted for a specific I/O size. For example, AWS gp2 can do 10,000 IOPS with a 16KiB payload size. That multiplies out to 160,000KiB/sec, which is about 156MiB/sec (or about 164MB/sec). However, it's unlikely that you'll use the full payload size all the time, so actual throughput will probably be lower. NB: a KiB is 1024 bytes, a KB is 1000 bytes.

Because an IOPS figure comes with an I/O size, it also gives you the maximum throughput. The reverse doesn't hold: a high throughput figure doesn't tell you that the device has high IOPS.
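As a rough illustration of that multiplication, here is a minimal Python sketch. The function name `max_throughput_bytes` is made up for this example, and the result is only an upper bound that assumes every operation uses the full I/O size.

```python
KIB = 1024          # bytes in a kibibyte
MIB = 1024 * 1024   # bytes in a mebibyte
MB = 1000 * 1000    # bytes in a megabyte


def max_throughput_bytes(iops: int, io_size_bytes: int) -> int:
    """Upper bound on bytes per second if every operation uses the full I/O size."""
    return iops * io_size_bytes


# The gp2 figures quoted above: 10,000 IOPS at 16 KiB per operation.
gp2_bytes_per_sec = max_throughput_bytes(10_000, 16 * KIB)
print(f"{gp2_bytes_per_sec / MIB:.2f} MiB/s")  # 156.25 MiB/s
print(f"{gp2_bytes_per_sec / MB:.2f} MB/s")    # 163.84 MB/s
```

Going the other way is not possible: a throughput number alone does not say how many operations it was split into.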

Scenarios

Consider these scenarios:

  • Booting your PC. Consider the difference between an SSD and a spinning disk in your computer, which is something many people have first-hand experience with. With a spinning disk the boot time can be a minute, whereas with an SSD it can come down to 10 to 15 seconds. This is because the SSD returns each small, scattered request with much lower latency, which is what a higher IOPS figure reflects. The throughput of the spinning disk is quite good at 150MB/sec, and although the SSD's is likely higher, that isn't why it boots faster.
  • Running an OS update. It's going all over the disk, adding and patching files. If you had low IOPS it would be slow, regardless of throughput.
  • Running a database, for example selecting a small amount of data from a large database. It will read from the index, read from a number of files, then return a result. Again it's going all over the disk to gather the information.
  • Playing a game on your PC. It likely loads a large number of textures from all over the disk. In this case both high IOPS and high throughput are likely to matter.
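To put rough numbers on these scenarios, the sketch below estimates how long a workload of small random reads takes when a device is bounded by both an IOPS limit and a throughput limit. The workload size and the device figures are illustrative assumptions, not measurements (only the 150MB/sec spinning-disk throughput comes from the text above), and `estimated_seconds` is a hypothetical helper.

```python
def estimated_seconds(num_ops, io_size_bytes, iops_limit, throughput_limit_bytes):
    """Time for a workload, bounded by whichever limit is hit first."""
    time_from_iops = num_ops / iops_limit
    time_from_throughput = (num_ops * io_size_bytes) / throughput_limit_bytes
    return max(time_from_iops, time_from_throughput)


# Hypothetical workload: 50,000 random reads of 4 KiB each (about 205 MB in total).
NUM_OPS = 50_000
IO_SIZE = 4 * 1024

# Assumed device figures, for illustration only.
hdd = estimated_seconds(NUM_OPS, IO_SIZE, iops_limit=100,
                        throughput_limit_bytes=150_000_000)
ssd = estimated_seconds(NUM_OPS, IO_SIZE, iops_limit=10_000,
                        throughput_limit_bytes=500_000_000)

print(f"spinning disk: {hdd:.0f} s")  # spinning disk: 500 s
print(f"SSD: {ssd:.0f} s")            # SSD: 5 s
```

With these assumed numbers the throughput limit is never reached on either device. The IOPS limit decides the result, which is the piece of information a MB/s figure alone would not give you.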

LTO Tape

Consider for a moment a tape backup system. LTO6 can do up to 400MB/sec with compression (160MB/sec native), but (I'm guessing here) it probably can't even do one random I/O operation per second; it could take seconds per operation. On the other hand it can probably do a whole lot of sequential IOPS, if an I/O operation is defined as reading or writing a parcel of data to tape.

If you tried to boot an OS off tape it would take a long time, if it worked at all. This is why IOPS is often more helpful than throughput.

To understand a storage device you probably want to know whether the quoted IOPS are random or sequential, and the I/O size they were measured with. From that you can derive throughput.

AWS

Note that AWS does publish both IOPS and throughput figures for all its storage types, on this page. General purpose SSD (gp2) can do 10,000 16KiB IOPS, which gives a maximum of roughly 160MB/sec. Provisioned IOPS (io1) is 20,000 16KiB IOPS, which gives a maximum of roughly 320MB/sec.

Note that with gp2 volumes you get 3 IOPS per GB provisioned, so to get 10,000 IOPS you need a 3.33TB volume. I don't recall for certain whether io1 volumes have a similar limitation (it's been a while since I did the associate exams where that kind of thing is tested), but I suspect they do; if so, I believe the maximum ratio is 50 IOPS per GB.
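The gp2 sizing arithmetic can be sketched as follows. This is a simplification that uses only the 3 IOPS per GB ratio mentioned above and ignores any minimums, caps, or burst behaviour; `gp2_size_gb_for_iops` is a hypothetical helper, not an AWS API.

```python
def gp2_size_gb_for_iops(target_iops: float, iops_per_gb: float = 3.0) -> float:
    """Volume size in GB needed to reach target_iops at a fixed IOPS-per-GB ratio."""
    return target_iops / iops_per_gb


size_gb = gp2_size_gb_for_iops(10_000)
print(f"{size_gb:.0f} GB (~{size_gb / 1000:.2f} TB)")  # 3333 GB (~3.33 TB)
```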

Conclusion

High sequential throughput is useful, and in some cases is the limiting factor to performance, but high IOPS is likely to be more important in most cases. You do still of course need reasonable throughput regardless of IOPS.