RPM vs read/write speed


I am a developer using languages like C++ and .NET. I am trying to understand how computers work at a lower level.

Say I have a hard disk that is 7200 RPM with a 150 MB/s read/write speed. My book implies that these measurements are independent of each other, e.g. you could just as well have a hard disk that is 5400 RPM with a 150 MB/s read/write speed.

Are these measurements independent of each other? What physical factors determine the read/write speed, e.g. the bus width?

Best Answer

In principle, a drive running at 7200 rpm is faster than one running at 5400 rpm. But there are so many other factors involved that you cannot compare two different drives by rotation speed alone. You also have to distinguish between access time and sustained data rate:

Access time

This is the time it takes from a request until the requested data is delivered. It can be divided into:

Seek time

which is the average time it takes to move the arm with the head from one track to another. It depends on the technology used: the first disks used stepper motors, for example, while today it's just a coil inside a strong magnetic field. The head has to accelerate and decelerate quickly, and some kind of damping is needed to keep the head from oscillating when it reaches the desired track.

A disk with more platters has more heads, and therefore more mass to move. This leads to a higher seek time.

If a physically identical device is sold with lower capacity, the number of tracks is smaller, and so is the average distance the head has to move; the seek time decreases.

The acceleration of the head can be tuned to get a loud but fast 'server' disk or a quiet but slower 'desktop' disk. Sometimes this setting can even be changed by the user.

Latency

Once the head has arrived at the desired track, it has to wait until the desired sector passes by. This depends only on the rotation speed: on average, the head has to wait half a revolution. The formula is simple:

$$t=\frac{1}{2\cdot f}$$ or, in practical units, $$t[\mathrm{ms}]=\frac{30000}{f[\mathrm{rpm}]}$$

For 7200rpm, the latency is 4.2ms, for 5400rpm it is 5.6ms and for 15000rpm, it's 2ms.
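A quick way to sanity-check these numbers (a minimal Python snippet, using nothing beyond the formula above):

```python
def rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: half a revolution, in milliseconds."""
    return 30000.0 / rpm

for rpm in (5400, 7200, 15000):
    print(f"{rpm:>6} rpm -> {rotational_latency_ms(rpm):.1f} ms")
```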

Nothing but increasing the rotation speed can lower this value.

Controller latency

Of course, the controller of the disk (and also the bus and the controller on the other side of the bus) plays a role, and in the past some drives had poor performance that could be improved by firmware updates.

But controllers can also speed up the disk, e.g. by Native Command Queuing (NCQ). If the head is on track 1 and the disk receives a request for data on track 100 followed by one for track 50, the controller may decide to read track 50 first and track 100 afterwards. The computer has to wait a little longer for the data from track 100, but overall this is faster than reading track 100 first.
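As a toy sketch of the idea (greedy nearest-track-first scheduling; real NCQ also takes the rotational position into account, and the function below is purely illustrative):

```python
def reorder_requests(head: int, pending: list[int]) -> list[int]:
    """Toy NCQ-style scheduler: always serve the pending track closest
    to the current head position. Real NCQ is more sophisticated."""
    order = []
    pending = pending[:]
    while pending:
        nearest = min(pending, key=lambda track: abs(track - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest
    return order

# Head on track 1; requests arrived for track 100, then track 50.
print(reorder_requests(1, [100, 50]))  # -> [50, 100]
```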

Cache may also be used to reduce latency: when a full track is to be read and the head arrives just as the middle of the track passes by, the disk may read the second half of the track first, then the first half, reorder the data in its cache, and send it to the computer.
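A little illustrative model of that trick; the sector list and helper below are my own invention, not anything a real controller exposes:

```python
def read_full_track(track: list[bytes], head_sector: int) -> list[bytes]:
    """Toy model: the head arrives mid-track, so the disk reads from the
    current position to the end, wraps around for the beginning, and the
    cache restores logical order before sending the data to the host."""
    physical_order = track[head_sector:] + track[:head_sector]  # read order
    n = len(track) - head_sector
    return physical_order[n:] + physical_order[:n]              # logical order

sectors = [bytes([i]) for i in range(8)]
assert read_full_track(sectors, 5) == sectors  # no half-revolution wasted
```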

Sustained data rate

This is the rate at which consecutive sectors can be read. Of course, it depends strongly on the spin speed of the disk, but there is more:

The more data a single track can hold, the more can be read during one revolution of the disk. So the easiest way is to increase the number of platters that are read out in parallel, at the cost of a higher seek time (more head mass to move, see above).

For example, doubling the number of platters (theoretically) doubles the rate, while going from 5400 rpm to 7200 rpm only multiplies it by 1.33. (And the capacity has doubled, too!)
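A back-of-the-envelope model makes the comparison concrete; the sector count, sector size, and head count below are made-up illustrative numbers, not specs of any real drive:

```python
def sustained_rate_mb_s(rpm: float, sectors_per_track: int,
                        sector_bytes: int = 512,
                        parallel_heads: int = 1) -> float:
    """Idealized sustained rate: bytes per revolution x revolutions per
    second. Real drives add overhead, zoning, etc."""
    bytes_per_rev = sectors_per_track * sector_bytes * parallel_heads
    return bytes_per_rev * (rpm / 60) / 1e6

# Illustrative: 2000 sectors/track at 7200 rpm, one active head
print(sustained_rate_mb_s(7200, 2000))                     # ~122.9 MB/s
# Doubling the heads read in parallel doubles the rate...
print(sustained_rate_mb_s(7200, 2000, parallel_heads=2))   # ~245.8 MB/s
# ...while 5400 -> 7200 rpm only gives a factor of 1.33
print(sustained_rate_mb_s(5400, 2000))                     # ~92.2 MB/s
```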

And of course, development of platter materials with higher data density goes on.

One other point may be interesting: at the outer diameter, tracks are naturally longer than at the inner diameter, so the number of sectors per track is increased in steps from the inner to the outer zones. As a result, the data rate is higher at the outer diameter! I remember discussions about whether to place the operating system's partition on the "outer" part of the disk...
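To get a feeling for the size of the effect: if the linear bit density is roughly constant, the data rate scales with the track circumference, i.e. with the radius. With assumed (purely illustrative) platter radii:

```python
# Assumed geometry for illustration only, not from any datasheet:
r_inner_mm, r_outer_mm = 20, 46
print(f"outer/inner data-rate ratio ~ {r_outer_mm / r_inner_mm:.1f}x")  # ~2.3x
```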


Edit:

I just remembered that I once wrote a script (for Linux) to measure the (uncached) read data rate over the entire disk. It uses

hdparm -t --offset <x> /dev/sda

where <x> is the position on the disk, in GB, at which to run the test. Here is the result for my 700GB disk:

[Plot: uncached read data rate vs. position on the 700 GB disk, dropping in steps from the start of the disk towards the end]

You can clearly see the steps caused by the differing number of sectors per track. It's interesting that the data rate is at its maximum at the lower bound, so sector counting starts at the outer diameter!
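The original script is long gone; a minimal Python reconstruction of the idea could look like this (assuming a reasonably recent hdparm with --offset support, root privileges, and /dev/sda as the disk under test; adjust as needed):

```python
import re
import subprocess

DISK = "/dev/sda"   # assumed device; change to the disk under test
SIZE_GB = 700       # total disk size in GB
STEP_GB = 10        # measurement granularity

for offset in range(0, SIZE_GB, STEP_GB):
    # hdparm -t measures uncached sequential reads; --offset moves the
    # test window to the given position (in GB) on the disk.
    out = subprocess.run(
        ["hdparm", "-t", "--offset", str(offset), DISK],
        capture_output=True, text=True, check=True,
    ).stdout
    # Parse "... = 99.54 MB/sec" (assumed hdparm output format).
    m = re.search(r"=\s*([\d.]+)\s*MB/sec", out)
    if m:
        print(f"{offset:4d} GB: {m.group(1)} MB/s")
```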


Again: If a physically identical device is sold with lower capacity by not using the inner tracks, its overall data rate is higher.

One other trick: it would be bad if the first sector of every track were located at the same angular position. When a large file spans several tracks, the head moves to the next track after reading that track's last sector, and would then have to wait almost a full revolution (i.e. twice the average latency) until the first sector passes by. To avoid this, the sector counting of each track is skewed: the first sector of each track is located a little further "downstream", so that when the head arrives, the first sector passes by shortly after.
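How much skew is needed follows directly from the rotation period; the track-switch time and sector count below are assumed illustrative values:

```python
import math

def skew_sectors(track_switch_ms: float, rpm: float,
                 sectors_per_track: int) -> int:
    """Number of sectors to offset the start of the next track so it
    arrives under the head right after a track-to-track switch."""
    rev_ms = 60000.0 / rpm  # one full revolution, in ms
    return math.ceil(track_switch_ms / rev_ms * sectors_per_track)

# Illustrative: 1 ms track switch at 7200 rpm (8.33 ms/rev), 2000 sectors/track
print(skew_sectors(1.0, 7200, 2000))  # -> 240 sectors of skew
```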

Another trick: a sector contains payload plus some internal data for organization, so larger sectors can carry a higher fraction of payload. This increases both data rate and capacity. A negative side effect: small files that occupy only a small part of a sector waste the rest of that sector.
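With assumed per-sector overhead figures (invented for illustration; real sync/ECC/gap sizes vary by drive), the gain from larger sectors is easy to estimate:

```python
def payload_fraction(payload_bytes: int, overhead_bytes: int) -> float:
    """Fraction of a sector's on-disk footprint that is actual payload."""
    return payload_bytes / (payload_bytes + overhead_bytes)

print(f"512 B sectors:  {payload_fraction(512, 65):.1%}")    # ~88.7%
print(f"4096 B sectors: {payload_fraction(4096, 145):.1%}")  # ~96.6%
```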


As you see, disk speed is influenced by so many things that it is hard to compare two disks. As a rule of thumb, disks with a higher spinning speed are faster, and disks with a higher capacity are faster; but if you compare similar disks (or identical disks with different manufacturer settings), these rules may fail. Also, these are just the basics; nobody knows all the tricks manufacturers use, or how they compare.

Remark: I used the terms track and sector here. In reality there are also clusters, and many drives do some kind of virtualization between their inner workings and the outside world. The computer itself can also help to improve performance: a file system that stores small files in some kind of database instead of in individual sectors can save space. All of this is not considered here.