Failover Cluster file server performance issue with Windows Server 2016

configuration · failover · network-share · performance · windows-server-2016

I've run into an interesting file transfer performance issue with a failover cluster I recently configured on Server 2016. The specific issue is that when I access the file share via the clustered storage path (e.g. \\store01\share01), transfer speed (writes in particular, it seems) is far slower than when I access it via the local path on the current owner node (e.g. \\srv04\e$\Shares\Share01).

For example, I copied 499 .txt files (totaling 26.07 MB) using Robocopy:

  • \\srv04\e$\Shares\Share01: 0:0:03 – 635 MB/min

  • \\store01\share01: 0:02:20 – 11.286 MB/min
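For anyone wanting to reproduce the comparison, the two copies were roughly of this shape (the source path `C:\TestData` is a placeholder here, not my actual source):

```shell
:: Copy via the owner node's local/administrative path (fast):
robocopy C:\TestData \\srv04\e$\Shares\Share01 *.txt

:: Same files via the clustered file server path (slow):
robocopy C:\TestData \\store01\share01 *.txt
```

Robocopy prints the elapsed time and throughput (Speed : ... MegaBytes/min) in its summary, which is where the numbers above come from.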

This is an issue regardless of the current owner node or where the data is transferred from. Although I didn't follow it at the time, I more or less installed and configured the service as indicated in this guide. I've tried messing with a few settings, but they're all back to default (as far as I know). I've looked around a bit and haven't found anything specifically mentioning a huge performance issue with using a failover cluster, so I've been doing some random research without much to show for it.

A few things about the configuration that might be relevant:

  • The cluster currently has two nodes. Both run Server 2016, and both have two NIC teams (configured in Windows, switch independent), each consisting of two 1 Gbit connections.
  • The actual storage being used is a Synology that both machines are accessing via iSCSI, configured using these instructions.
  • Everything else seems to work fine, in that simulating a failover works and the other node takes over a few seconds later.
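By "simulating a failover" I mean moving the clustered role between nodes, roughly like this in an elevated PowerShell (the role and node names here are illustrative, not necessarily my exact ones):

```shell
# List the clustered roles and their current owner nodes
Get-ClusterGroup

# Move the file server role to the other node to simulate a failover
Move-ClusterGroup -Name "store01" -Node srv05
```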

I'm guessing this is one of those "obvious to anybody who knows more than I do" sort of situations. Or maybe I'm just hoping for that. Either way, I appreciate any guidance! I tried to keep it short, so please let me know if you need any other information.

Thanks in advance.

Best Answer

Your first issue is NICs teamed for iSCSI. You should never do that unless both your target and initiator support MCS (multiple connections per session), and in your case neither of them does.

https://www.starwindsoftware.com/blog/lacp-vs-mpio-on-windows-platform-which-one-is-better-in-terms-of-redundancy-and-speed-in-this-case-2

http://scst.sourceforge.net/mc_s.html

Solution: un-team the NICs used for iSCSI and use MPIO (Multipath I/O) instead.
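A rough sketch of that fix in PowerShell, not a tested procedure for your exact setup (the team name and target IQN below are assumptions you'd replace with your own):

```shell
# Break the NIC team that carries iSCSI traffic (team name is an assumption)
Remove-NetLbfoTeam -Name "iSCSI-Team" -Confirm:$false

# Install the Multipath I/O feature and let the Microsoft DSM claim iSCSI devices
Install-WindowsFeature -Name Multipath-IO
Enable-MSDSMAutomaticClaim -BusType iSCSI

# Reconnect the Synology target once per NIC/path with multipath enabled
# (the IQN below is a made-up example)
Connect-IscsiTarget -NodeAddress "iqn.2000-01.com.synology:store.Target-1" `
    -IsMultipathEnabled $true
```

A reboot is typically required after installing the MPIO feature, and you would repeat the reconnect for each iSCSI NIC/path on both nodes.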

Your second issue is the Synology itself. It's not suitable for primary storage; it's a backup unit at best.

Solution: move your content to local disks and use the Synology as a backup repository.