How to upgrade an iSCSI Switch stack with minimal downtime

dell, dell-equallogic, dell-powerconnect, iscsi

We have several Dell 6248 switch stacks (of 3 switches each) that form the backbone of our iSCSI storage network. We need to perform firmware updates on the switch stacks, but are concerned about the downtime required.

For background, our storage is exclusively Dell/Equallogic PS6000 series enclosures, with 3 or 4 GigE uplinks per enclosure.

As you might already know, these switches can't be upgraded a member at a time, and the reboot required to upgrade the switches is on the order of two minutes (i.e. longer than the iSCSI initiator timeout for a volume).

Does anyone have any suggestions for how we might be able to accomplish an iSCSI SAN switch stack upgrade while minimizing downtime?

Thanks for any help or suggestions.

Joe

Best Answer

If your core iSCSI network has been set up correctly for Equallogic, you should have two separate stacks, with normal ISLs connecting the two stacks, and all arrays and hosts should have at least one connection to each stack. If that is the case, then the simplest and lowest-impact approach is to follow the standard Dell firmware update procedure for stacked PowerConnects, one stack at a time, with its 2+ minute per switch timescale. You shouldn't experience any actual downtime if the cabling has been done properly, because the other stack keeps carrying traffic, but performance will be significantly degraded, so you should only do this when everything is quiet. I'd double-check that all of the connections are OK first, though, because you will definitely be relying on many single links to keep things alive while the upgrade happens.
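To make that pre-check concrete, here is a minimal sketch of the kind of audit I mean. It assumes you have already collected a cabling inventory by hand or from your own tooling; the device names, stack names, and link states below are made up for illustration and nothing here talks to a switch or an array.

```python
from typing import Dict, List, Tuple

# Hypothetical inventory format: device -> list of (stack_name, link_is_up).
Links = Dict[str, List[Tuple[str, bool]]]


def devices_at_risk(links: Links, stacks: List[str]) -> Dict[str, List[str]]:
    """Return, for each device, the stacks it has no healthy link to."""
    at_risk: Dict[str, List[str]] = {}
    for device, device_links in links.items():
        # Only links that are currently up count as usable paths.
        healthy = {stack for stack, is_up in device_links if is_up}
        missing = [stack for stack in stacks if stack not in healthy]
        if missing:
            at_risk[device] = missing
    return at_risk


# Illustrative data only (assumed names, not taken from a real environment).
inventory: Links = {
    "ps6000-a": [("stack1", True), ("stack1", True), ("stack2", True)],
    "ps6000-b": [("stack1", True), ("stack2", False)],
    "esx-host-1": [("stack1", True), ("stack2", True)],
}

report = devices_at_risk(inventory, ["stack1", "stack2"])
for device, missing in report.items():
    print(f"{device}: no healthy link to {', '.join(missing)}")
```

Any device that shows up in the report would lose its storage path when the stack it does reach is rebooted, so fix those links before starting.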

Breaking PowerConnect switches out of the stack and upgrading them individually might be possible, but you will have to go through a very intricate process to ensure that each switch upgrade happens in isolation, and you have to be very careful when reconnecting the upgraded switches because they can't be stacked until all units are at the same version. You will possibly have to recreate the configs for most of the switches if you take this route. You will also have to ensure that all active switches have some reasonably high-bandwidth connectivity to both stacks when you bring them online; that is an Equallogic requirement that seriously complicates this sort of exercise. If you end up in a scenario where one switch appears to be active as far as the arrays are concerned but is isolated from either stack, then at best you will have some serious performance problems, and at worst all volumes hosted from the arrays connected to that switch may go offline. To be honest, I really wouldn't want to do it that way: there are far too many points at which it could go wrong.
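If you do go down that road, it is worth modelling each intermediate step before you pull a cable. The following is only a rough sketch under my own assumptions: switch names, stack membership, and links are hypothetical, and it checks plain reachability only, not whether the path has enough bandwidth.

```python
from collections import deque


def reachable(start, links, active):
    """Return the set of active switches reachable from start over links."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for a, b in links:
            if node in (a, b):
                peer = b if node == a else a
                if peer in active and peer not in seen:
                    seen.add(peer)
                    queue.append(peer)
    return seen


def isolated_switches(active, links, stack_members):
    """List (switch, stack) pairs where an active switch cannot reach that stack."""
    isolated = []
    for switch in sorted(active):
        seen = reachable(switch, links, active)
        for stack, members in stack_members.items():
            if not (seen & members):
                isolated.append((switch, stack))
    return isolated


# Hypothetical mid-upgrade state: s1c has been broken out of stack1,
# is powered on with arrays attached, but has no uplink cabled yet.
stack_members = {"stack1": {"s1a", "s1b"}, "stack2": {"s2a", "s2b", "s2c"}}
active = {"s1a", "s1b", "s1c", "s2a", "s2b", "s2c"}
links = [("s1a", "s1b"), ("s1a", "s2a"), ("s2a", "s2b"), ("s2b", "s2c")]

problems = isolated_switches(active, links, stack_members)
print(problems)  # [('s1c', 'stack1'), ('s1c', 'stack2')]

# Cabling s1c to s1a (an assumed uplink) clears the problem.
fixed = isolated_switches(active, links + [("s1c", "s1a")], stack_members)
print(fixed)  # []
```

Any non-empty result is exactly the "active but isolated" scenario described above, so that step of the plan needs rethinking.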