DRBD – Fix Inconsistent and Outdated Nodes

drbd

I inherited this setup, and it is really old (running DRBD 8.3). I tried drbdadm connect drbd0 and drbdadm primary -f drbd0, but every attempt comes back with "Need access to UpToDate data".

I presume that is because the disk state on drbd0 is Inconsistent.

[root@node-01 ~]# drbd-overview
  0:drbd0  StandAlone Secondary/Unknown   Inconsistent/Outdated r-----
  1:drbd1  Connected  Secondary/Secondary UpToDate/UpToDate     C      r-----

[root@node-02 ~]# drbd-overview
  0:drbd0  WFConnection Secondary/Unknown   Inconsistent/DUnknown C r-----
  1:drbd1  Connected    Secondary/Secondary UpToDate/UpToDate     C r-----

How can I fix this, without nuking the data on it?

When I ran drbdadm connect drbd0, the system log showed:

block drbd0: conn( StandAlone -> Unconnected )
block drbd0: Starting receiver thread (from drbd0_worker [6860])
block drbd0: receiver (re)started
block drbd0: conn( Unconnected -> WFConnection )
block drbd0: Handshake successful: Agreed network protocol version 96
block drbd0: conn( WFConnection -> WFReportParams )
block drbd0: Starting asender thread (from drbd0_receiver [21821])
block drbd0: data-integrity-alg: <not-used>
block drbd0: drbd_sync_handshake:
block drbd0: self AA586D9040BXXXX:7DF55F42BF95XXXX:7DF45F42BF95XXXX:DC31D449C727XXXX bits:416 flags:0
block drbd0: peer 7DF55F42BF9XXXX:0000000000000000:DC31D449C727EE27:DC30D449C727XXXX bits:416 flags:0
block drbd0: uuid_compare()=1 by rule 70
block drbd0: I shall become SyncSource, but I am inconsistent!
block drbd0: conn( WFReportParams -> Disconnecting )
block drbd0: error receiving ReportState, l: 4!
block drbd0: asender terminated
block drbd0: Terminating asender thread
block drbd0: Connection closed
block drbd0: conn( Disconnecting -> StandAlone )
block drbd0: receiver terminated
block drbd0: Terminating receiver thread

Best Answer

Neither node has UpToDate data, so DRBD will not be able to go Primary without some convincing. You'll need to force a node into Primary.

Whichever node you run the following command on should become the SyncSource, so be sure to choose the node you believe has good data.

drbdadm -- --overwrite-data-of-peer primary <resource>

If you're not sure, I would disconnect the resource on both nodes so they're both StandAlone, run the above command on one node to force it to Primary, and then inspect the data. Then repeat on the other node. Once you know where the good data is, demote both sides and resolve the split-brain in the correct direction. Tell the split-brain victim to discard its data with drbdadm -- --discard-my-data connect <resource>, then simply connect the split-brain survivor with drbdadm connect <resource>.
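The sequence above can be sketched as a dry-run helper that only prints the commands for each step and never runs them. The function name print_drbd_plan and its step names (inspect, victim, survivor) are hypothetical and not part of DRBD. The drbdadm invocations are the ones quoted in this answer in DRBD 8.3 syntax, plus drbdadm disconnect and drbdadm secondary for the "disconnect" and "demote" steps. Treat it as a checklist to review, not as a tested recovery script.

```shell
#!/bin/sh
# Dry-run sketch: prints (does NOT execute) the drbdadm commands for one step.
# print_drbd_plan is a hypothetical helper, not something shipped with DRBD.
print_drbd_plan() {
    step="$1"      # inspect | victim | survivor
    resource="$2"  # resource name from drbd.conf, e.g. drbd0
    case "$step" in
        inspect)
            # Isolate the node, then force it Primary so its data can be examined.
            echo "drbdadm disconnect $resource"
            echo "drbdadm -- --overwrite-data-of-peer primary $resource"
            ;;
        victim)
            # Node whose data is thrown away: demote, then reconnect discarding local data.
            echo "drbdadm secondary $resource"
            echo "drbdadm -- --discard-my-data connect $resource"
            ;;
        survivor)
            # Node whose data is kept: demote, then reconnect normally.
            echo "drbdadm secondary $resource"
            echo "drbdadm connect $resource"
            ;;
        *)
            echo "usage: print_drbd_plan inspect|victim|survivor <resource>" >&2
            return 1
            ;;
    esac
}

# Example: show what would be run on the node chosen as split-brain victim.
print_drbd_plan victim drbd0
```

Printing the plan first makes it harder to run --discard-my-data on the wrong node by accident: read the output on each host, confirm which side is the victim, and only then type the commands by hand.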

Hope that helps!