Encryption – Handling Encryption Key Conflicts When Synchronizing Data

dataencryptionusability

Assume that there is data that gets synchronized between several devices. The data is protected with a symmetric encryption algorithm and a key. The key is stored on each device and encrypted with a password. When a user changes the password only the key gets re-encrypted.

Under normal circumstances, when there is a good network connection to other peers, the current key gets synchronized and all data on the new device gets encrypted with the same key.

But how to handle situations where a new device doesn’t have a network connection and e.g. creates its own new, but incompatible key? How to keep the usability as high as possible under such circumstances?

  1. The application could detect that there is no network and hence refuse to start. That’s very bad usability in my opinion, because the application isn’t functional at all in this case. I don’t consider this a solution.

  2. The application could ignore the missing network connection and create a new key. But what to do when the application gains a network connection? There will be several incompatible keys and some parts of the underlying data could only be encrypted with one key and other parts with another key.

  3. The situation would get worse if there would be more keys than just two and the application would’ve to ask every time for a password when another object that should get decrypted with another key would be needed.

  4. It is very messy and time consuming to try to re-encrypt all data that is encrypted with another key with a main key. What should be the main key at all in this case? The oldest key? The key with the most encrypted objects? What if the key got synchronized but not all objects that got encrypted with this particular key? How should the user know for which particular password the application asks and why it takes probably very long to re-encrypt the data? It’s very hard to describe encryption “issues” to users.

So far I didn’t find an acceptable solution, nor some kind of generic strategy. Do you have some hints about a concrete strategy or some books / papers that describe synchronization of symmetrically encrypted data with keys that could cause conflicts?

Best Answer

While I don't know of any books/papers that discuss this exact problem, it seems to me that any solution to "the synchronization problem", paired with any solution to "the avoid-re-encrypting-file-with-new-key problem", should solve your original problem. Each of those sub-problems have several solutions.

The synchronization problem

You have one "common file" (in this case, a symmetric key) that, ideally, you want to be the same across all devices. However, for one reason or another, the data is somehow different from one device to the next -- split-brain syndrome -- and you want all the devices connected to the network to somehow reach a consensus as to whether to use version A from now on, or use version B from now on, or perhaps some entirely new version C from now on.

There are three popular approaches:

  • Restructure the application to give people the functionality they want, without ever having such a "master key". In this case, the standard approach is to use public-key systems that let every device generate its own unique private key, then generate the public key from the private key, then (somehow) distribute the public keys.
  • Use some sort of quorum protocol to come to consensus.
  • Somehow time-tag each version, and when any node discovers that there are two versions, it picks the newest version. (I suppose it could pick the oldest version, but that makes it difficult to upgrade).

One of many possible solutions goes like this:

  • If the common file does not already exist locally, and I can't seem to connect to any other devices and download the one they are using, go ahead and create a new version of the common file.
  • Later, when connected to the network and we see other devices, somehow (?) download the version of the common file of each of those other connected devices. If there is at least a quorum of other devices (say, 2 other devices), then start using the version of the common file used by most connected devices (the plurality). If there is a tie among the top N versions of this file, then pick one of those top N versions at random and start using it.

In particular, if every device has a different version of this file, then the "birthday problem" practically guarantees that, after enough iterations of this algorithm, eventually 2 devices will pick the same version of the file, and eventually all the online devices will converge on the same version of the file.

The avoid-re-encrypting-file-with-new-key problem

All problems in computer science can be solved by another level of indirection. But that usually will create another problem. -- attributed to David Wheeler in the book Beautiful Code (2007)

As I understand it,

  • You have (encrypted) data that is synchronized between multiple devices
  • You want to allow a person to change the passphrase on a device to some other passphrase of his own choosing, but you want this to be more-or-less "instant" rather than taking many minutes to decrypt and re-encrypt all the data.
  • Each device has its own passphrase that can be used to access the data on the device
  • You don't want unauthorized people to be able to decrypt the data (and "has the passphrase for this device" is an adequate proxy for "is authorized to decrypt the data on this device").
  • You want to allow a person to create new encrypted data files before the device is ever connected to a network -- and therefore before the devices knows any shared encryption keys -- and later when the device is connected to the network, those files are synchronized to other devices where other people can decrypt and read those files.

The standard way of doing that is to store the data in OpenPGP format (as standardized in RFC 4880). a b c d e

You already have one layer of indirection -- a person types a passphrase, which is used to decrypt the device-specific password.

The OpenPGP process uses a second layer of indirection: Every file is encrypted with its own unique symmetric key.

It works something like this:

Every time new data is created or edited, a completely new symmetric key is generated, the new key itself is encrypted with the user's public key and that encrypted key is stored in the header of the encrypted file. The data is encrypted with that new symmetric key and stored afterward in that encrypted file. (This can all be done before the device ever connects to the network).

Later that encrypted file is synchronized unmodified over the network. (Except the sender somehow obtains the receiver's device-specific key, encrypts the file-specific symmetric key with the receiver's key, and then adds that encrypted key to the file header).

To decrypt that file and read the data,

  • A person types in the device-specific passphrase
  • The device uses that passphrase to decode a file containing the device-specific key. (This is exactly what you are doing already).
  • The device pulls the encrypted file-specific key out of the header of the file, and uses the device-specific key to decrypt the file-specific key. (This is a second layer of indirection).
  • Then the device uses the file-specific key to decrypt the data in the file.

To make the system easier to change/migrate,

  • Use an encrypted file format (such as OpenPGP) that specifies exactly which encryption algorithm was used for this particular file. That allows future software to detect which encryption algorithm was used to create a particular file. Then the device can decrypt today's shiny new files using today's shiny new preferred algorithm. The device can also decrypt dusty old files with yesterday's dusty old algorithms -- and optionally re-encrypt using today's shiny new preferred algorithm.

  • Use an encrypted file format (such as OpenPGP) that allows you to store the particular file-specific symmetric key in the header several times, each time encrypted with a different public key or device-specific key.

When a user changes the passphrase, only the device-specific key gets re-encrypted, just like what you are doing already.

If for any reason the device-specific key needs to change, then the device must re-encrypt the file-specific key in the header of each and every encrypted file it holds. That's probably faster than decrypting and re-encrypting the entire file.

Have you considered using some off-the-shelf implementation of OpenPGP, such as "Pretty Good Privacy" or "GNU Privacy Guard"?

Related Topic