Why would diskspd perform better without cache?

disk-cache · hard-drive · performance · windows-server-2012-r2

We are currently investigating high disk latency on a Windows Server 2012 R2 machine that runs as a SQL server. It is a virtual machine under VMware, and the datastore of the faulty disk is backed by a very high-performance LUN on a SAN.

The SAN shows very good response times for the LUN, even during incidents and during my tests. The datastore also shows very good response times at all times. CPU and memory are not the bottleneck; I have double-checked.

Microsoft suggested we use diskspd to test our disk performance. Here are the results of two sets of tests. I have run them several times, for longer intervals and at different times of day, so I am sure the results are not incidental.

Command Line: diskspd -b64k -o32 -t4 -d60 -w50 -Sw -r -L -c20G -Z1G C:\iotest.data

Total IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
total: | 12623020032 | 192612 | 200.59 | 3209.46 | 38.636 | 21.687 | C:\iotest.data

Command Line: diskspd -b64k -o32 -t4 -d60 -w50 -Su -r -L -c20G -Z1G C:\iotest.data

Total IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
total: | 78517239808 | 1198078 | 1247.71 | 19963.34 | 6.410 | 8.202 | C:\iotest.data

According to the diskspd documentation, -Sw enables write-through IO and -Su disables software caching. Let me specify that the first command line gives the same results with or without -Sw, which tells me that this flag does not have much impact here. From this tool (created and maintained by the Windows team), we could conclude that the cache (disabled with -Su) is ruining disk performance, but that does not seem right.
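The buffered-versus-write-through distinction can be illustrated outside of diskspd. Below is a minimal sketch (assuming a POSIX system; the helper name `timed_writes`, the temp-file paths, and the 64 KiB block size mirroring -b64k are all my own choices, not part of diskspd). It writes the same data once through the OS cache and once with O_SYNC, which forces each write to be committed to stable storage before returning, roughly what write-through means. On Windows, the corresponding CreateFile flag is FILE_FLAG_WRITE_THROUGH.

```python
import os
import tempfile
import time

def timed_writes(path, extra_flags, block=64 * 1024, count=64):
    """Write `count` blocks of `block` bytes using the given extra
    open() flags and return the elapsed wall-clock time in seconds."""
    data = b"\0" * block
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | extra_flags, 0o600)
    start = time.perf_counter()
    try:
        for _ in range(count):
            os.write(fd, data)
    finally:
        os.close(fd)
    return time.perf_counter() - start

with tempfile.TemporaryDirectory() as tmp:
    # Default: writes land in the OS page cache and are flushed later.
    buffered = timed_writes(os.path.join(tmp, "buffered.dat"), 0)
    # O_SYNC: each write returns only once the data is on stable
    # storage -- the POSIX analogue of diskspd's write-through (-Sw).
    sync = timed_writes(os.path.join(tmp, "sync.dat"), os.O_SYNC)
    print(f"buffered: {buffered:.4f}s  O_SYNC: {sync:.4f}s")
```

On a real disk the O_SYNC run is typically far slower, which is the same shape of gap the two diskspd runs above show; on a RAM-backed filesystem the difference may vanish.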

My questions are:

  • Why would software caching lower performance?

  • Does it impact running applications the same way it impacts this test?

  • Why does IOMeter give me the same performance as the test without
    software caching?

Thanks,

Best Answer

You are changing two variables at once, buffering and write-through, so you cannot tell which one caused the difference. A better comparison would be between -Sb ("buffered") and -Su ("unbuffered") alone.
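Concretely, that controlled comparison would change only the caching flag while keeping every other parameter from the question's command lines unchanged:

Command Line: diskspd -b64k -o32 -t4 -d60 -w50 -Sb -r -L -c20G -Z1G C:\iotest.data

Command Line: diskspd -b64k -o32 -t4 -d60 -w50 -Su -r -L -c20G -Z1G C:\iotest.data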

Actually, it is quite likely that -Sw is killing your performance. Any storage array worth the name has a write cache, so writes can be acknowledged faster than they can be committed to the disks in the array. Write-through bypasses that cache.

Keep checking all available components on the array, SAN, and host, both software and hardware.