This is a setup I inherited, and it is really old (running DRBD 8.3). I tried drbdadm connect drbd0 and drbdadm primary -f drbd0, but everything comes back with "Need access to UpToDate data".
I presume that is because the disk state is Inconsistent.
[root@node-01 ~]# drbd-overview
0:drbd0 StandAlone Secondary/Unknown Inconsistent/Outdated r-----
1:drbd1 Connected Secondary/Secondary UpToDate/UpToDate C r-----
[root@node-02 ~]# drbd-overview
0:drbd0 WFConnection Secondary/Unknown Inconsistent/DUnknown C r-----
1:drbd1 Connected Secondary/Secondary UpToDate/UpToDate C r-----
How can I fix this, without nuking the data on it?
When I did drbdadm connect drbd0, the system log says:
block drbd0: conn( StandAlone -> Unconnected )
block drbd0: Starting receiver thread (from drbd0_worker [6860])
block drbd0: receiver (re)started
block drbd0: conn( Unconnected -> WFConnection )
block drbd0: Handshake successful: Agreed network protocol version 96
block drbd0: conn( WFConnection -> WFReportParams )
block drbd0: Starting asender thread (from drbd0_receiver [21821])
block drbd0: data-integrity-alg: <not-used>
block drbd0: drbd_sync_handshake:
block drbd0: self AA586D9040BXXXX:7DF55F42BF95XXXX:7DF45F42BF95XXXX:DC31D449C727XXXX bits:416 flags:0
block drbd0: peer 7DF55F42BF9XXXX:0000000000000000:DC31D449C727EE27:DC30D449C727XXXX bits:416 flags:0
block drbd0: uuid_compare()=1 by rule 70
block drbd0: I shall become SyncSource, but I am inconsistent!
block drbd0: conn( WFReportParams -> Disconnecting )
block drbd0: error receiving ReportState, l: 4!
block drbd0: asender terminated
block drbd0: Terminating asender thread
block drbd0: Connection closed
block drbd0: conn( Disconnecting -> StandAlone )
block drbd0: receiver terminated
block drbd0: Terminating receiver thread
Best Answer
Neither node has UpToDate data, so DRBD will not be able to go Primary without some convincing. You'll need to force a node into Primary. Whichever node you run the following command on should become the SyncSource, so be sure you choose the node you believe to have good data:
drbdadm -- --overwrite-data-of-peer primary <resource>
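The "neither node has UpToDate data" diagnosis can be read off the drbd-overview lines in the question: the fourth column holds the local and peer disk states. As a rough sketch, assuming the column layout shown in the question (the helper name local_disk_state is made up for illustration), the local disk state can be pulled out like this:

```shell
#!/bin/sh
# Hypothetical helper: print the local disk state from one drbd-overview line.
# Assumes the layout seen in the question:
#   <minor>:<resource> <connection state> <roles> <local disk>/<peer disk> ...
local_disk_state() {
    # Field 4 is "<local disk>/<peer disk>"; keep the part before the slash.
    printf '%s\n' "$1" | awk '{ split($4, d, "/"); print d[1] }'
}

# Possible usage on a live node:
#   drbd-overview | while IFS= read -r line; do local_disk_state "$line"; done
```

Any resource that reports something other than UpToDate here on both nodes will refuse a plain promotion to Primary.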
If you're not sure, I would disconnect the resource on both nodes so they're both StandAlone, run the above command on one node, promote that node to Primary, and then inspect the data. Then repeat on the other node. Once you know where the good data is, you can demote both sides and resolve the split-brain in the correct direction by telling the split-brain victim to discard its data:
drbdadm -- --discard-my-data connect <resource>
and simply connecting the split-brain survivor:
drbdadm connect <resource>
Hope that helps!
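The inspect-then-resolve procedure above can be sketched as a small script. This is only an outline under some assumptions: the DRY_RUN switch, the run helper and the function names are invented for illustration, and the read-only mount assumes the DRBD device holds a filesystem directly (the device and mount point are whatever applies on your system). By default it only prints the commands, so nothing touches DRBD until you set DRY_RUN=0.

```shell
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Run on one node at a time to look at its copy of the data.
# $1 = resource, $2 = DRBD device, $3 = mount point (both assumptions).
inspect_node() {
    run drbdadm disconnect "$1"
    run drbdadm -- --overwrite-data-of-peer primary "$1"
    run mount -o ro "$2" "$3"
}

# Run when done inspecting: unmount and demote again.
finish_inspect() {
    run umount "$2"
    run drbdadm secondary "$1"
}

# Run on the node whose data is thrown away (split-brain victim).
resolve_victim() {
    run drbdadm -- --discard-my-data connect "$1"
}

# Run on the node whose data is kept (split-brain survivor).
resolve_survivor() {
    run drbdadm connect "$1"
}
```

For example, calling inspect_node drbd0 /dev/drbd0 /mnt/inspect in dry-run mode prints the three commands so you can review them before running anything for real.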