Exporting a DRBD device over iSCSI from two primary nodes is a bad idea if both paths receive concurrent write requests. Nevertheless, I am thinking about using this setup as backend storage for an ESXi 5.5U2 host.

I have already tested this with primary/secondary configurations and a classic failover cluster.

ESXi detects the multipath setup and actively uses only one path. So in this configuration the concurrent-write problem does not seem to arise.

Now the problem in both cases (primary/secondary or primary/primary) is: How do I shut down an iSCSI server (the target, in iSCSI terms) that has active open connections to an iSCSI client (the initiator, in iSCSI terms)?

I currently use CentOS 5 on the target servers.

CentOS 5 uses tgtd to provide the targets. To my surprise, the normal stop method fails if clients are connected. Instead, forcedstop seems to be what I need in this case.

I want to shut down one server cleanly (I have to stop access to the target so that I can switch DRBD to secondary), and the other server should then automatically become active (nothing to do there in this setup, IMHO).

Questions in that context:
Is the following ok, or am I missing something?

  1. forced stop of tgtd (this first takes the targets offline)
  2. tear down the IP address facing the initiator (a different link than the one used for DRBD replication)
  3. shut down DRBD (making it secondary first)
  4. reboot or shut down the server
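
To make the intended order concrete, here is a minimal sketch of the four steps as a dry-run script. The DRBD resource name `r0` and the interface name `eth1` are placeholders I made up, not values from my setup, and the script assumes a Python 3 interpreter is available on the node. By default it only prints the commands.

```python
import subprocess


def shutdown_plan(resource="r0", iscsi_iface="eth1"):
    """Return the ordered commands for taking one target node offline.

    resource and iscsi_iface are hypothetical placeholders.
    """
    return [
        ["service", "tgtd", "forcedstop"],   # 1. force-stop the iSCSI target daemon
        ["ifdown", iscsi_iface],             # 2. take down the initiator-facing interface
        ["drbdadm", "secondary", resource],  # 3a. demote DRBD to secondary first
        ["drbdadm", "down", resource],       # 3b. then shut the resource down
        ["shutdown", "-h", "now"],           # 4. halt (use "-r" to reboot instead)
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in plan:
        print(" ".join(cmd))
        if not dry_run:
            # Abort on the first failing step so later steps never run
            # against a half-stopped node.
            subprocess.check_call(cmd)


if __name__ == "__main__":
    run_plan(shutdown_plan())  # dry run: prints the sequence only
```

Stopping at the first failure matters here: if tgtd cannot be stopped, DRBD should not be demoted underneath it.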

Best Answer

Yes, I did miss something. The problem is still that the underlying protocol (SCSI) is stateful. So even if I manage to shut down the target (e.g. with forcedstop), the active initiators are left in a "hanging" state.

But in my use case there is a solution to the problem:

  1. In vCenter, disable all paths to the iSCSI server that is to be rebooted.
  2. This terminates all open iSCSI transactions in an orderly fashion and opens new ones on the other path, to the other server.
  3. After that, the iSCSI server can be rebooted safely without interrupting the client.
  4. Once the iSCSI server is up and running again, the original iSCSI paths can be reactivated by enabling them in vCenter.
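
I did this through the vCenter GUI. As far as I know, the same path toggle can also be done from the ESXi shell with `esxcli storage core path set`, so the procedure could presumably be scripted. The sketch below only builds the command lines and does not run anything. The path runtime names are hypothetical placeholders; the real ones would come from `esxcli storage core path list`.

```python
def path_state_commands(paths, state):
    """Build one esxcli command per path to set its state.

    paths: path runtime names or UIDs (placeholders in this sketch).
    state: "off" to disable a path, "active" to re-enable it.
    """
    if state not in ("active", "off"):
        raise ValueError("state must be 'active' or 'off'")
    return [
        ["esxcli", "storage", "core", "path", "set",
         "--state=" + state, "--path=" + path]
        for path in paths
    ]


def maintenance_commands(paths):
    """Return (before_reboot, after_reboot) command lists for one target server."""
    return path_state_commands(paths, "off"), path_state_commands(paths, "active")


if __name__ == "__main__":
    # Hypothetical runtime names of the paths that lead to the server being rebooted.
    before, after = maintenance_commands(["vmhba33:C0:T0:L0", "vmhba33:C1:T0:L0"])
    for cmd in before + after:
        print(" ".join(cmd))
```

The commands in `before` would run on the ESXi host before the target server is rebooted, and those in `after` once it is back up.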

So the proper answer to my questions seems to be:

Short: There is no proper way. Your clients will hang.

Long: It depends. If you have a layer in between that can properly quiesce and terminate the iSCSI traffic first, you can stop the target afterwards (even if the target server still thinks that initiators are connected).