How does a SAN architecture work and, more importantly, scale?

architecture · controller · hba · storage-area-network

I'm trying to understand some SAN infrastructure and I was hoping some of you with more experience than me could help me understand scaling with a SAN.

Imagine that you have some servers, each with an HBA. They connect, either directly or via a switch, to a SAN controller. The SAN controller then provides one or more LUNs, which most likely map to a RAID array on a storage device.
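
To keep my own mental model straight, here is a tiny Python sketch of that mapping as I understand it; all the names, sizes, and the single-controller layout are made up for illustration, not taken from any real vendor's management interface.

```python
# Illustrative sketch of server -> controller -> LUN -> RAID array mapping.
# Every name and number here is an assumption used only to show the relationships.
from dataclasses import dataclass


@dataclass
class RaidArray:
    name: str
    raid_level: str      # e.g. "RAID-6"
    capacity_gb: int


@dataclass
class Lun:
    lun_id: int
    size_gb: int
    backing_array: RaidArray   # a LUN is carved out of a RAID array


@dataclass
class Controller:
    name: str
    luns: list


@dataclass
class Server:
    name: str
    hba_ports: int
    mapped_luns: list


# One controller carves a LUN out of a RAID-6 array and presents it to a
# server across the fabric (the switch is omitted for brevity).
array = RaidArray("shelf-1", "RAID-6", capacity_gb=24_000)
lun0 = Lun(lun_id=0, size_gb=2_000, backing_array=array)
ctrl = Controller("ctrl-A", luns=[lun0])
host = Server("db-01", hba_ports=2, mapped_luns=[lun0])

print(f"{host.name} sees LUN {lun0.lun_id} via {ctrl.name}, "
      f"backed by {array.name} ({array.raid_level})")
```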

So if I understand correctly, the "controller" represents a performance bottleneck. If you need lots of performance then you add more controllers with connections to their own storage, which then get mapped to the servers that need them.

I imagine you can get some very high-performance controllers with huge storage capacities, as well as cheaper controllers with a lower maximum performance? And if you have a switch, can you then add several of those lower-performance controllers to your network as you need them?

Please tear apart my understanding if I have it wrong, but I'm trying to work out how you connect HBAs from a server to storage without the fabric simply representing "magic".

Best Answer

The controller as a performance bottleneck is quite true, and in some architectures it can represent a single point of failure as well. This has been known for quite some time. For a while there were vendor-specific techniques for working around it, but the industry as a whole has since converged on something called MPIO, or Multi-Path I/O.

With MPIO you can present the same LUN across multiple paths through the storage fabric. If the server's HBA and the storage array's HBA each have two connections to the fabric, the server can have four separate paths to the LUN. It can go beyond this if the storage supports it; it is quite common for the larger disk arrays to have dual-controller setups, with each controller presenting an active connection to the LUN. Add in a server with two separate HBA cards, plus two physically separate paths connecting the controller/HBA pairs, and you can have a storage path with no single point of failure.
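
To make the path counting concrete, here is a minimal sketch: the number of usable paths is just the cross product of fabric-facing ports on each side. The port names and counts are assumptions for illustration only.

```python
# Counting MPIO paths as the cross product of host ports and array ports.
# Port names are invented; two ports on each side gives the 2 x 2 = 4 paths
# mentioned above.
from itertools import product

server_hba_ports = ["hba0-p1", "hba0-p2"]    # two fabric connections on the host
array_ports = ["ctrlA-p1", "ctrlB-p1"]       # one port per array controller

paths = list(product(server_hba_ports, array_ports))
for host_port, array_port in paths:
    print(f"{host_port} -> fabric -> {array_port}")

print(f"{len(paths)} distinct paths to the LUN")   # prints 4
```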

The fancier controllers will indeed be a full active/active pair, with both controllers actually talking to the storage (generally there is some form of shared cache between the controllers to help with coordination). Middle-tier devices may pretend to be active/active: only a single controller is actually performing work at any given time, but the standby controller can pick up immediately should the active one go silent, and no I/O operations are dropped. Lower-tier devices are simple active/standby, where all I/O goes along one path and only moves to other paths when the active path dies.
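
The difference between those tiers is essentially the path-selection policy the host's multipath layer can use. Here is a hedged Python sketch of the two behaviours; the path names are made up and this is a toy model, not how any particular multipath driver is implemented.

```python
# Two toy path-selection policies: round-robin across all active paths
# (active/active) versus strict failover (active/standby).
from itertools import cycle


class RoundRobinPolicy:
    """Active/active: spread I/O across every active path in turn."""
    def __init__(self, paths):
        self._cycle = cycle(paths)

    def next_path(self):
        return next(self._cycle)


class FailoverPolicy:
    """Active/standby: use the first healthy path only; move on when it dies."""
    def __init__(self, paths):
        self.paths = paths

    def next_path(self, healthy):
        for path in self.paths:
            if healthy.get(path, False):
                return path
        raise RuntimeError("no healthy path to LUN")


paths = ["hba0->ctrlA", "hba1->ctrlB"]

rr = RoundRobinPolicy(paths)
print([rr.next_path() for _ in range(4)])   # alternates between both controllers

fo = FailoverPolicy(paths)
print(fo.next_path({"hba0->ctrlA": True, "hba1->ctrlB": True}))    # primary path
print(fo.next_path({"hba0->ctrlA": False, "hba1->ctrlB": True}))   # after failover
```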

Having multiple active controllers can indeed provide better performance than a single active controller. And yes, add enough systems hitting storage, and enough fast storage behind the controller, and you can saturate the controllers to the point where every attached server notices. A good way to simulate this is to force a parity RAID volume to rebuild.
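
A quick back-of-the-envelope calculation shows why a rebuild makes saturation visible to everyone; every number below is an assumption chosen purely for illustration, not a measurement from any real array.

```python
# Back-of-the-envelope controller saturation estimate. All figures are
# illustrative assumptions.
controller_limit_mbps = 4_000      # assumed sustained throughput of one controller
rebuild_overhead_mbps = 1_500      # assumed bandwidth consumed by a parity rebuild
servers = 12
per_server_demand_mbps = 300

demand = servers * per_server_demand_mbps
available = controller_limit_mbps - rebuild_overhead_mbps

print(f"aggregate demand : {demand} MB/s")
print(f"available        : {available} MB/s")
if demand > available:
    print("controller saturated -> every attached server sees higher latency")
```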

Not all systems are able to leverage MPIO to use multiple active paths; that is still somewhat new. Also, one of the problems all of the controllers have to solve is ensuring that every I/O operation is committed in order, regardless of which path it came in on and which controller received it. That problem gets harder the more controllers you add. Storage I/O is a fundamentally serialized operation and doesn't lend itself to massive parallelization.
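
To illustrate why more controllers don't simply multiply throughput, here is a toy Python sketch of the ordering problem: writes may arrive on either controller, but they all have to pass through one serialization point. The shared, locked commit log below is a stand-in assumption for whatever coordination mechanism a real dual-controller array uses internally.

```python
# Toy model: two "controllers" accept writes concurrently, but every commit
# must pass through one shared serialization point to preserve ordering.
import threading


class SharedCommitLog:
    """Single serialization point that all controllers must go through."""
    def __init__(self):
        self._lock = threading.Lock()
        self._next_seq = 0
        self.committed = []

    def commit(self, controller_name, op):
        with self._lock:                 # serializes commits from every controller
            seq = self._next_seq
            self._next_seq += 1
            self.committed.append((seq, controller_name, op))
            return seq


log = SharedCommitLog()


def controller(name, ops):
    for op in ops:
        log.commit(name, op)


t_a = threading.Thread(target=controller, args=("ctrl-A", ["write blk 10", "write blk 11"]))
t_b = threading.Thread(target=controller, args=("ctrl-B", ["write blk 10", "write blk 12"]))
t_a.start(); t_b.start(); t_a.join(); t_b.join()

# One global commit order emerges, no matter which controller received the I/O;
# the shared lock is exactly the kind of coordination cost that grows with
# every controller you add.
for seq, ctrl_name, op in log.committed:
    print(seq, ctrl_name, op)
```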

You can get some gains by adding controllers, but the gains rapidly fade in the light of the added complexity required to make it work at all.