Red Hat – Automatic recovery with discarded data from a NetworkFailure-interrupted DRBD sync

drbd, high-availability, redhat

Suppose I have two DRBD devices provisioned. When the second node connects, it syncs the data from the first (primary/master) node.

During this sync, the primary node loses power.

After the Primary node is lost, the original Secondary is the only available node, and it is left in the Inconsistent/DUnknown state.

Is there any way to recover from this automatically?

version: 8.4.7 (api:1/proto:86-101)
srcversion: 0904DF2CCF7283ACE07D07A

 1: cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/DUnknown C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:390452

I can recover from this situation manually by running drbdadm primary --force <resource-name> and then (since this is a Pacemaker cluster) pcs resource cleanup, but I am looking for a way to trigger this recovery automatically.
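To illustrate what "automatic" would have to mean here, a watchdog would need to recognize this exact stuck state before acting. The following is only a sketch of that decision logic (the script, the polling of /proc/drbd, and the placeholder resource name are all assumptions, not part of any DRBD tooling); unconditionally force-promoting an Inconsistent node risks data loss and split-brain, which is why this normally belongs in a cluster manager's fencing/quorum policy rather than a cron job:

```shell
#!/bin/sh
# Hypothetical watchdog sketch: decide whether a lone node stuck in
# WFConnection with Inconsistent/DUnknown data should be force-promoted.
# Input: one /proc/drbd status line; output: "promote" or "wait".

should_promote() {
    line="$1"
    case "$line" in
        *cs:WFConnection*ds:Inconsistent/DUnknown*) echo promote ;;
        *)                                          echo wait ;;
    esac
}

# Example status line taken from the question above:
status=' 1: cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/DUnknown C r-----'

if [ "$(should_promote "$status")" = promote ]; then
    # On a real node this is where the recovery commands would run.
    echo "would run: drbdadm primary --force <resource> && pcs resource cleanup"
fi
```

A healthy Connected/UpToDate status line would fall through to "wait", so the script only ever acts on the specific failure state shown in the question.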

Full logs of an example run

[   20.233788] drbd: initialized. Version: 8.4.7 (api:1/proto:86-101)
[   20.234905] drbd: srcversion: 0904DF2CCF7283ACE07D07A
[   20.235791] drbd: registered as block device major 147
[   22.402786] drbd shareddata: Starting worker thread (from drbdsetup-84 [1406])
[   22.406433] block drbd1: disk( Diskless -> Attaching )
[   22.407422] drbd shareddata: Method to ensure write ordering: flush
[   22.408478] block drbd1: max BIO size = 4096
[   22.409211] block drbd1: drbd_bm_resize called with capacity == 2097016
[   22.410317] block drbd1: resync bitmap: bits=262127 words=4096 pages=8
[   22.411492] block drbd1: size = 1024 MB (1048508 KB)
[   22.413787] block drbd1: recounting of set bits took additional 0 jiffies
[   22.414922] block drbd1: 1024 MB (262127 bits) marked out-of-sync by on disk bit-map.
[   22.416189] block drbd1: Suspended AL updates
[   22.416942] block drbd1: disk( Attaching -> UpToDate )
[   22.418403] block drbd1: attached to UUIDs 9FB19F9A9D6573A9:0000000000000004:0000000000000000:0000000000000000
[   22.460721] drbd shareddata: conn( StandAlone -> Unconnected )
[   22.462303] drbd shareddata: Starting receiver thread (from drbd_w_sharedda [1407])
[   22.467153] drbd shareddata: receiver (re)started
[   22.468715] drbd shareddata: conn( Unconnected -> WFConnection )
[   23.000120] drbd shareddata: Handshake successful: Agreed network protocol version 101
[   23.003987] drbd shareddata: Feature flags enabled on protocol level: 0x7 TRIM THIN_RESYNC WRITE_SAME.
[   23.008195] drbd shareddata: conn( WFConnection -> WFReportParams )
[   23.010706] drbd shareddata: Starting ack_recv thread (from drbd_r_sharedda [1467])
[   23.067880] block drbd1: max BIO size = 1048576
[   23.069557] block drbd1: drbd_sync_handshake:
[   23.070869] block drbd1: self 9FB19F9A9D6573A8:0000000000000004:0000000000000000:0000000000000000 bits:262127 flags:0
[   23.073539] block drbd1: peer 3B5A831140811725:0000000000000004:0000000000000000:0000000000000000 bits:262127 flags:0
[   23.076210] block drbd1: uuid_compare()=100 by rule 90
[   23.077596] block drbd1: helper command: /sbin/drbdadm initial-split-brain minor-1
[   23.081505] block drbd1: helper command: /sbin/drbdadm initial-split-brain minor-1 exit code 0 (0x0)
[   23.084035] block drbd1: Split-Brain detected, 1 primaries, automatically solved. Sync from peer node
[   23.086539] block drbd1: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated ) pdsk( DUnknown -> UpToDate )
[   23.089588] block drbd1: Resumed AL updates
[   23.103227] block drbd1: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 21(1), total 21; compression: 100.0%
[   23.105986] block drbd1: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 21(1), total 21; compression: 100.0%
[   23.108662] block drbd1: conn( WFBitMapT -> WFSyncUUID )
[   23.127823] block drbd1: updated sync uuid 68A55F3E62EDE97C:0000000000000000:0000000000000000:0000000000000000
[   23.136222] block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1
[   23.140260] block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1 exit code 0 (0x0)
[   23.142823] block drbd1: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent )
[   23.145214] block drbd1: Began resync as SyncTarget (will sync 1048508 KB [262127 bits set]).
[   61.912243] drbd shareddata: PingAck did not arrive in time.
[   61.914470] drbd shareddata: peer( Primary -> Unknown ) conn( SyncTarget -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
[   61.919882] drbd shareddata: ack_receiver terminated
[   61.921491] drbd shareddata: Terminating drbd_a_sharedda
[   61.968612] drbd shareddata: Connection closed
[   61.970170] drbd shareddata: conn( NetworkFailure -> Unconnected )
[   61.971855] drbd shareddata: receiver terminated
[   61.973304] drbd shareddata: Restarting receiver thread
[   61.974743] drbd shareddata: receiver (re)started
[   61.976187] drbd shareddata: conn( Unconnected -> WFConnection )
[   62.008237] block drbd1: State change failed: Need access to UpToDate data
[   62.010446] block drbd1:   state = { cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/DUnknown r----- }
[   62.013170] block drbd1:  wanted = { cs:WFConnection ro:Primary/Unknown ds:Inconsistent/DUnknown r----- }
[   76.334863] drbd shareddata: conn( WFConnection -> Disconnecting )
[   76.336529] drbd shareddata: Discarding network configuration.
[   76.338082] drbd shareddata: Connection closed
[   76.339375] drbd shareddata: conn( Disconnecting -> StandAlone )
[   76.340898] drbd shareddata: receiver terminated
[   76.342203] drbd shareddata: Terminating drbd_r_sharedda
[   76.343712] block drbd1: disk( Inconsistent -> Failed )
[   76.364417] block drbd1: 560 MB (143363 bits) marked out-of-sync by on disk bit-map.
[   76.366742] block drbd1: disk( Failed -> Diskless )
[   76.404579] drbd shareddata: Terminating drbd_w_sharedda

Best Answer

If you don't care about the data, why replicate it in the first place? ;)

Since this is the initial sync, your Secondary node will have Inconsistent data until the sync completes. Up until that point, you'll always have to force promote the Secondary into Primary, which isn't a great thing to be doing.

Why not skip the initial sync, and then use DRBD's LVM snapshot before-resync-target handler to protect against this scenario moving forward?

To skip the initial sync: once you have stood up a new device on both nodes and they are cs:Connected and ds:Inconsistent/Inconsistent, clear the bitmap, marking the current state "consistent" (from one node only, not both):

# drbdadm new-current-uuid --clear-bitmap all
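In context, the whole skip-initial-sync sequence might look like the following (a sketch; "shareddata" is the resource name taken from the logs above, and verifying the cs:Connected state before the last step is left to the operator):

```shell
# On BOTH nodes: create metadata and bring the resource up
drbdadm create-md shareddata
drbdadm up shareddata

# On ONE node only, once /proc/drbd shows
# cs:Connected ds:Inconsistent/Inconsistent:
drbdadm new-current-uuid --clear-bitmap shareddata
```

After the last command both sides report UpToDate without any data having crossed the wire, which is only safe for a brand-new device whose contents you genuinely do not care about.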

Then, use DRBD's before-resync-target/after-resync-target handlers to take/remove snapshots of your backing LVM device before/after the resyncs so you always have a consistent dataset in case a failure does occur during a resync:

resource <resource> {
...
  handlers {
    before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh";
    after-resync-target "/usr/lib/drbd/unsnapshot-resync-target-lvm.sh";
  }
}

You'd then be able to recover the snapshot using lvconvert just like any other LVM snapshot.
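For example, that recovery might look like the following (a sketch with hypothetical names: "vg0" as the volume group, and the snapshot name is whatever snapshot-resync-target-lvm.sh actually created on your system, which lvs will show):

```shell
lvs vg0                       # find the snapshot the handler left behind
drbdadm down shareddata       # DRBD must release the backing LV first
lvconvert --merge vg0/snap    # "snap" is a placeholder for the real name
# Once the merge completes, bring the DRBD resource back up; the backing
# device now holds the consistent pre-resync data.
```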
