Design – How to solve the issue of consistency in a concurrent and distributed application (built around the Bankers Dilemma)

concurrency design distributed-system

This is a classic problem which I'm sure has been solved many times by many different people. I don't have any formal training (I've not studied computer science or any other such academic subject) and so I'm not sure of the best way to solve the problem I'm about to describe.

Imagine the diagram below is an example of a bankers dilemma: two users, Foo and Bar, have access to a single bank account, Baz. What is the expected behaviour when following one of the paths shown?

Note: I'm assuming we're using a mutex (or some other form of synchronisation) on the Baz variable.

Example 1: Baz initially holds the value 10. If Foo writes a new value (the result of removing 5 from the current value) before Bar does, then Bar will end up taking 10 from the new value of 5, leaving a negative balance (i.e. the final value will be -5). In other words, more money has been taken than was available.

Example 2: Baz initially holds the value 10. If Bar writes a new value (the result of removing 10 from the current value) before Foo does, then Foo will end up taking 5 from the new value of 0, leaving a negative balance (i.e. the final value will be -5). Again, more money has been taken than was available.

Both actions (Foo (-5) and Bar (-10)) are triggered at the same time. So how do we ensure that either Foo or Bar is alerted to the fact that their transaction cannot be completed (as there are not enough funds for it to succeed)?

It seems a potential solution is to have the caller execute a method that uses a mutex internally: it locks the value first, then reads it, and then checks whether the action is valid. If the check passes we update the value; if it fails we report that to the caller. Either way we then release the lock, so the next caller can lock the value and run through the same steps.
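As a minimal sketch of that idea, assuming a single process with threads, the lock, read, check and update can live in one method. The names `Account`, `withdraw` and `run_concurrent_withdrawals` are hypothetical and only there for illustration.

```python
import threading


class Account:
    """In-memory account guarded by a mutex (hypothetical example class)."""

    def __init__(self, balance):
        self._balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount):
        """Debit the account and return True if funds suffice, else return False."""
        with self._lock:  # lock first, so read, check and write act as one step
            if self._balance < amount:
                return False  # the caller learns the transaction cannot complete
            self._balance -= amount
            return True

    def balance(self):
        with self._lock:
            return self._balance


def run_concurrent_withdrawals():
    """Foo (-5) and Bar (-10) hit Baz (10) at the same time."""
    baz = Account(10)
    results = {}

    def attempt(user, amount):
        results[user] = baz.withdraw(amount)

    threads = [
        threading.Thread(target=attempt, args=("Foo", 5)),
        threading.Thread(target=attempt, args=("Bar", 10)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, baz.balance()
```

Whichever thread gets the lock first succeeds and the other is told `False`, so the balance ends at 5 or 0 but never goes negative. Which of the two wins is not determined by this code.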

But how would this approach work in a distributed system? You could suggest using a global data store, but it would have to be one that guarantees consistency (e.g. a service such as AWS DynamoDB offers "eventual consistency" by default, which on its own wouldn't work for a banking institution); and guaranteed consistency is generally considered to be slow (depending on the number of distributed nodes, I assume).

So how do we attempt to solve this design problem?

[Diagram: Bankers Dilemma]

Best Answer

For a distributed system, you would either:

a) Use "subtract amount or return error if you can't", where the code responsible for baz performs the check and the subtraction as a single operation: it returns an error if the result would have been negative, and "success" otherwise.

b) Use the equivalent of locking, where the code responsible for baz offers "acquire baz" and "release baz" operations that callers must use before and after working with it.
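To make the two options concrete, here is a rough Python sketch of a process that owns baz. Everything in it (`BazService`, `subtract_or_error`, `acquire`, `release`, the token scheme) is a hypothetical illustration rather than a real library, and it assumes the service handles one request at a time. Method calls stand in for the messages that would cross the network.

```python
import uuid


class BazService:
    """Hypothetical process that owns baz; other nodes reach it only via these calls.

    Assumes requests are handled one at a time, so each method runs to
    completion before the next request is looked at.
    """

    def __init__(self, balance):
        self._balance = balance
        self._holder = None  # token of the caller currently holding baz, if any

    # Option (a): one request that checks and subtracts in a single step.
    def subtract_or_error(self, amount):
        if self._holder is not None:
            return "error: baz is locked"
        if self._balance < amount:
            return "error: insufficient funds"
        self._balance -= amount
        return "success"

    # Option (b): explicit acquire/release around the caller's own read and write.
    def acquire(self):
        if self._holder is not None:
            return None  # someone else holds baz; the caller should retry later
        self._holder = uuid.uuid4().hex
        return self._holder

    def read(self, token):
        self._require(token)
        return self._balance

    def write(self, token, value):
        self._require(token)
        self._balance = value

    def release(self, token):
        self._require(token)
        self._holder = None

    def _require(self, token):
        if token is None or token != self._holder:
            raise PermissionError("caller does not hold baz")


def withdraw_with_lock(service, amount):
    """Client side of option (b): acquire, read, check, write, release."""
    token = service.acquire()
    if token is None:
        return "error: baz is locked"
    try:
        current = service.read(token)
        if current < amount:
            return "error: insufficient funds"
        service.write(token, current - amount)
        return "success"
    finally:
        service.release(token)  # always give the lock back, even on failure
```

Option (a) needs a single round trip per withdrawal. Option (b) needs several, and a real system would also have to decide what happens when a caller crashes while holding the lock (for example by expiring the token), which this sketch leaves out.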

Note that this is typically just the tip of the iceberg. More likely you've got two or more bank accounts and want to transfer funds from one to another such that either all accounts are updated or none are. In this case you might end up with a combination of the two approaches.

For example, if there are two accounts "Fred" and "Jane" and you want to transfer $5 from Fred to Jane; then you might end up with a sequence like:

  • From you to Fred's account: "If Fred's balance is 5 or greater, lock Fred's account and tell me I can proceed, else tell me I can't proceed"

  • From Fred's account to you: "You may proceed"

  • From you to Jane's account: "If Jane's account can be increased by 5 lock Jane's account and tell me I can proceed, else tell me I can't proceed"

  • From Jane's account to you: "You may proceed"

  • From you to Fred's account: "Subtract 5 from Fred's account and release the lock you gave me previously"

  • From you to Jane's account: "Add 5 to Jane's account and release the lock you gave me previously"

Note that in this example you, Fred's account and Jane's account may all be running on completely different computers, communicating via messages/packets (with no shared memory at all).
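The message sequence above could be sketched roughly as follows in Python. `AccountNode`, its methods and the optional `limit` are hypothetical names, each method call stands in for one request/reply pair between machines, and the `release` step on failure is an assumption added here, since the sequence above only shows the successful path.

```python
class AccountNode:
    """Hypothetical account service; each method stands in for one request/reply."""

    def __init__(self, balance, limit=None):
        self.balance = balance
        self.limit = limit    # optional maximum balance, so a credit can be refused
        self.locked = False

    def lock_if_can_debit(self, amount):
        if self.locked or self.balance < amount:
            return False      # "you can't proceed"
        self.locked = True
        return True           # "you may proceed"

    def lock_if_can_credit(self, amount):
        over_limit = self.limit is not None and self.balance + amount > self.limit
        if self.locked or over_limit:
            return False
        self.locked = True
        return True

    def apply_and_release(self, delta):
        if not self.locked:
            raise RuntimeError("apply_and_release called without holding the lock")
        self.balance += delta
        self.locked = False

    def release(self):
        self.locked = False   # give up the lock without changing the balance


def transfer(source, target, amount):
    """Move amount from source to target; both balances change or neither does."""
    if not source.lock_if_can_debit(amount):
        return False
    if not target.lock_if_can_credit(amount):
        source.release()      # assumed failure path: unlock the source again
        return False
    source.apply_and_release(-amount)
    target.apply_and_release(amount)
    return True
```

For instance, with `fred = AccountNode(10)` and `jane = AccountNode(0)`, calling `transfer(fred, jane, 5)` returns `True` and leaves both balances at 5, while a later `transfer(fred, jane, 6)` returns `False` and changes nothing. The sketch ignores lost messages and crashes between the lock and the final update, which a real system would need to handle.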
