Performance IO – Are Networks Now Faster Than Disks?

io performance

This is a software design question.

I used to work from the following rule of thumb for speed:

cache memory > memory > disk > network

Each step was 5-10 times slower than the one before it (e.g. cache memory is 10 times faster than main memory).

Now, it seems that gigabit ethernet has latency less than local disk. So, maybe operations to read out of a large remote in-memory DB are faster than local disk reads. This feels like heresy to an old timer like me. (I just spent some time building a local cache on disk to avoid having to do network round trips – hence my question)

Does anybody have any experience / numbers / advice in this area?

And yes, I know that the only real way to find out is to build and measure, but I was wondering about the general rule.
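For the "build and measure" part, a minimal timing harness might look like the sketch below. It is only an illustration: `median_latency_ns`, `make_file_reader` and `demo` are hypothetical names, and the temporary file is a stand-in for a real disk cache. Repeated reads of a small file are normally served from the OS page cache, so this will not show true disk seek cost unless the cache is bypassed or cold. The same harness could wrap a call to a remote store to get the network side of the comparison.

```python
import os
import statistics
import tempfile
import time


def median_latency_ns(operation, repeats=200):
    """Run operation() repeatedly and return the median wall-clock time in ns."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        operation()
        samples.append(time.perf_counter_ns() - start)
    return int(statistics.median(samples))


def make_file_reader(path, size=2048):
    """Return a callable that opens path and reads `size` bytes from the start."""
    def read_once():
        with open(path, "rb") as f:
            return f.read(size)
    return read_once


def demo():
    # Hypothetical stand-in for the local disk cache. Repeated reads of a
    # small file usually hit the OS page cache rather than the physical disk.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(2048))
        path = tmp.name
    try:
        return median_latency_ns(make_file_reader(path))
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("median local read:", demo(), "ns")
```

The median is used rather than the mean so that a few slow outliers (a scheduler hiccup, a cold first read) do not dominate the result.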

edit:

This is the interesting data from the top answer:

  • Round trip within same datacenter 500,000 ns

  • Disk seek 10,000,000 ns

This is a shock for me; my mental model is that a network round trip is inherently slow. And it's not: going by these two figures, it's 20x faster than a disk 'round trip'.

Jeff Atwood posted this very good blog post on the topic: http://blog.codinghorror.com/the-infinite-space-between-words/

Best Answer

Here are some numbers that you are probably looking for, as quoted by Jeff Dean, a Google Fellow:

Numbers Everyone Should Know

L1 cache reference                             0.5 ns
Branch mispredict                              5 ns
L2 cache reference                             7 ns
Mutex lock/unlock                            100 ns (25)
Main memory reference                        100 ns
Compress 1K bytes with Zippy              10,000 ns (3,000)
Send 2K bytes over 1 Gbps network         20,000 ns
Read 1 MB sequentially from memory       250,000 ns
Round trip within same datacenter        500,000 ns
Disk seek                             10,000,000 ns
Read 1 MB sequentially from network   10,000,000 ns
Read 1 MB sequentially from disk      30,000,000 ns (20,000,000)
Send packet CA->Netherlands->CA      150,000,000 ns
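To relate the table to the question (local disk cache versus a remote in-memory DB), here is a rough back-of-envelope model built only from the figures above. The model itself is an assumption, not something from the talk: it treats a local read as one seek plus a sequential read, treats a remote read as one round trip plus a memory read on the server plus the network transfer, and simply adds the costs with no overlap. The function names are made up for illustration.

```python
# Figures copied from the table above, in nanoseconds.
NS = {
    "send_2k_over_1gbps": 20_000,
    "read_1mb_memory": 250_000,
    "datacenter_round_trip": 500_000,
    "disk_seek": 10_000_000,
    "read_1mb_network": 10_000_000,
    "read_1mb_disk": 30_000_000,
}


def local_disk_read_ns(megabytes):
    """Assumed model: one seek, then a sequential read from local disk."""
    return NS["disk_seek"] + megabytes * NS["read_1mb_disk"]


def remote_memory_read_ns(megabytes):
    """Assumed model: one round trip, server reads from RAM, data crosses the network."""
    per_mb = NS["read_1mb_memory"] + NS["read_1mb_network"]
    return NS["datacenter_round_trip"] + megabytes * per_mb


def small_local_read_ns():
    """A small record from disk: dominated by the seek (transfer time ignored)."""
    return NS["disk_seek"]


def small_remote_read_ns():
    """A ~2 KB record from a remote in-memory store: round trip plus sending 2 KB."""
    return NS["datacenter_round_trip"] + NS["send_2k_over_1gbps"]


if __name__ == "__main__":
    print("1 MB local disk:    ", local_disk_read_ns(1), "ns")      # 40000000
    print("1 MB remote memory: ", remote_memory_read_ns(1), "ns")   # 10750000
    print("small local read:   ", small_local_read_ns(), "ns")      # 10000000
    print("small remote read:  ", small_remote_read_ns(), "ns")     # 520000
```

Under these assumptions the remote in-memory read wins in both cases, by roughly 19x for a small record and a little under 4x for 1 MB, where the network transfer starts to dominate. These are 2009-era spinning-disk figures, so the result should be rechecked against measurements on the actual hardware.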

It's from his presentation titled "Designs, Lessons and Advice from Building Large Distributed Systems", given at Large-Scale Distributed Systems and Middleware (LADIS) 2009.

Other Info


It's said that gcc -O4 emails your code to Jeff Dean for a rewrite.