While I don't know of any books/papers that discuss this exact problem,
it seems to me that
any solution to "the synchronization problem",
paired with any solution to "the avoid-re-encrypting-file-with-new-key problem",
should solve your original problem.
Each of those sub-problems has several solutions.
The synchronization problem
You have one "common file" (in this case, a symmetric key)
that, ideally, you want to be the same across all devices.
However, for one reason or another,
the data is somehow different from one device to the next --
split-brain syndrome --
and you want all the devices connected to the network to
somehow reach a consensus as to whether to use version A from now on,
or use version B from now on, or perhaps some entirely new version C from now on.
There are three popular approaches:
- Restructure the application to give people the functionality they want, without ever having such a "master key". In this case, the standard approach is to use public-key systems that let every device generate its own unique private key, then generate the public key from the private key, then (somehow) distribute the public keys.
- Use some sort of quorum protocol to come to consensus.
- Somehow time-tag each version, and when any node discovers that there are two versions, it picks the newest version. (I suppose it could pick the oldest version, but that makes it difficult to upgrade).
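The time-tag approach in the last bullet can be sketched as follows. This is only an illustration: the `Version` record, its `key_id` and `created_at` fields, and the `pick_newest` helper are hypothetical names, and the sketch assumes the devices' clocks are close enough for timestamps to be comparable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    key_id: str        # hypothetical label for one version of the common file
    created_at: float  # time tag assigned when this version was created


def pick_newest(versions):
    """Return the version with the latest time tag.

    Ties on the time tag are broken by key_id so that every node that
    sees the same set of versions makes the same choice.
    """
    return max(versions, key=lambda v: (v.created_at, v.key_id))


if __name__ == "__main__":
    seen = [Version("A", 100.0), Version("B", 200.0)]
    print(pick_newest(seen).key_id)
```

The deterministic tie-break matters: without it, two nodes holding versions with identical time tags could each keep their own and never converge.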
One of many possible solutions goes like this:
- If the common file does not already exist locally, and I can't seem to connect to any other devices and download the one they are using, go ahead and create a new version of the common file.
- Later, when connected to the network and we see other devices, somehow (?) download the version of the common file of each of those other connected devices. If there is at least a quorum of other devices (say, 2 other devices), then start using the version of the common file used by most connected devices (the plurality). If there is a tie among the top N versions of this file, then pick one of those top N versions at random and start using it.
In particular, if every device has a different version of this file,
then the "birthday problem" practically guarantees that, after enough iterations of this algorithm, eventually 2 devices will pick the same version of the file,
and eventually all the online devices will converge on the same version of the file.
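One round of the plurality rule described above might look like the sketch below. It assumes that only the peers' versions are counted (not the device's own) and that versions can be compared by some identifier such as a hash; `choose_version` and `QUORUM` are hypothetical names.

```python
import random
from collections import Counter

QUORUM = 2  # minimum number of other devices, as in the example above


def choose_version(my_version, peer_versions, rng=random):
    """Return the version this device should use after seeing its peers."""
    if len(peer_versions) < QUORUM:
        # Not enough peers to form a quorum: keep the local version.
        return my_version
    counts = Counter(peer_versions)
    top = max(counts.values())
    # All versions tied for the highest count; pick one of them at random.
    leaders = sorted(v for v, n in counts.items() if n == top)
    return rng.choice(leaders)


if __name__ == "__main__":
    print(choose_version("A", ["B", "B", "C"]))  # plurality among peers is B
```

Each device would run this every time it sees the network; the random tie-break is what lets a fully split set of devices eventually drift toward a shared version.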
The avoid-re-encrypting-file-with-new-key problem
All problems in computer science can be solved by another level of
indirection. But that usually will create another problem. --
attributed to David Wheeler in the book Beautiful Code (2007)
As I understand it,
- You have (encrypted) data that is synchronized between multiple devices
- You want to allow a person to change the passphrase on a device to some other passphrase of his own choosing, but you want this to be more-or-less "instant" rather than taking many minutes to decrypt and re-encrypt all the data.
- Each device has its own passphrase that can be used to access the data on the device
- You don't want unauthorized people to be able to decrypt the data (and "has the passphrase for this device" is an adequate proxy for "is authorized to decrypt the data on this device").
- You want to allow a person to create new encrypted data files before the device is ever connected to a network -- and therefore before the device knows any shared encryption keys -- and later, when the device is connected to the network, those files are synchronized to other devices where other people can decrypt and read those files.
The standard way of doing that is to store the data in OpenPGP format (as standardized in RFC 4880).
You already have one layer of indirection -- a person types a passphrase, which is used to decrypt the device-specific key.
The OpenPGP process uses a second layer of indirection:
Every file is encrypted with its own unique symmetric key.
It works something like this:
Every time new data is created or edited, a completely new symmetric key is generated.
The new key is itself encrypted with the user's public key, and that encrypted key is stored in the header of the encrypted file. The data is encrypted with the new symmetric key and stored after the header in the same file.
(This can all be done before the device ever connects to the network).
Later that encrypted file is synchronized unmodified over the network.
(Except the sender somehow obtains the receiver's device-specific key,
encrypts the file-specific symmetric key with the receiver's key,
and then adds that encrypted key to the file header).
To decrypt that file and read the data,
- A person types in the device-specific passphrase
- The device uses that passphrase to decode a file containing the device-specific key. (This is exactly what you are doing already).
- The device pulls the encrypted file-specific key out of the header of the file, and uses the device-specific key to decrypt the file-specific key. (This is a second layer of indirection).
- Then the device uses the file-specific key to decrypt the data in the file.
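The two layers of indirection can be sketched in a few lines. This is a structural illustration only, under loud assumptions: `toy_encrypt` is a stand-in stream cipher built from HMAC-SHA256 with no authentication and is NOT suitable for real use; device keys are treated as symmetric keys for simplicity, whereas OpenPGP would normally wrap the file key with public keys; and the dictionary layout and function names are hypothetical, not the OpenPGP packet format.

```python
import hashlib
import hmac
import os


def _keystream(key, nonce, length):
    """HMAC-SHA256 in counter mode: a toy stand-in for a real cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        msg = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, msg, hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key, plaintext):
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key, blob):
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


def encrypt_file(data, device_keys):
    """Encrypt data under a fresh file key; wrap that key once per device."""
    file_key = os.urandom(32)
    header = {dev: toy_encrypt(k, file_key) for dev, k in device_keys.items()}
    return {"header": header, "body": toy_encrypt(file_key, data)}


def decrypt_file(container, device_id, device_key):
    """Unwrap the file key from the header, then decrypt the body."""
    file_key = toy_decrypt(device_key, container["header"][device_id])
    return toy_decrypt(file_key, container["body"])


def add_recipient(container, own_id, own_key, new_id, new_key):
    """Give another device access by adding one header entry; body untouched."""
    file_key = toy_decrypt(own_key, container["header"][own_id])
    container["header"][new_id] = toy_encrypt(new_key, file_key)


if __name__ == "__main__":
    laptop, phone = os.urandom(32), os.urandom(32)
    box = encrypt_file(b"lab results", {"laptop": laptop})
    add_recipient(box, "laptop", laptop, "phone", phone)
    print(decrypt_file(box, "phone", phone))
```

Note that `add_recipient` never touches the body: sharing a file with another device costs one small key-wrapping operation, regardless of how large the file is.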
To make the system easier to change/migrate:
- Use an encrypted file format (such as OpenPGP) that specifies exactly which encryption algorithm was used for this particular file. That allows future software to detect which encryption algorithm was used to create a particular file. Then the device can decrypt today's shiny new files using today's shiny new preferred algorithm. The device can also decrypt dusty old files with yesterday's dusty old algorithms -- and optionally re-encrypt using today's shiny new preferred algorithm.
- Use an encrypted file format (such as OpenPGP) that allows you to store the particular file-specific symmetric key in the header several times, each time encrypted with a different public key or device-specific key.
When a user changes the passphrase, only the device-specific key gets re-encrypted, just like what you are doing already.
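A passphrase change under this scheme only re-wraps the 32-byte device-specific key. The sketch below derives a key-encryption key with PBKDF2 from the standard library and uses a simple XOR wrap as a placeholder. The XOR wrap, the iteration count, and all function names are assumptions for illustration; a real system would use an authenticated key-wrapping scheme so that a wrong passphrase is detected rather than silently yielding garbage.

```python
import hashlib
import os

ITERATIONS = 100_000  # illustrative value, not a recommendation


def _kek(passphrase, salt):
    """Derive a 32-byte key-encryption key from the passphrase."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, ITERATIONS
    )


def wrap_device_key(device_key, passphrase):
    if len(device_key) != 32:
        raise ValueError("this toy wrap only handles 32-byte keys")
    salt = os.urandom(16)  # fresh salt per wrap, so the pad is never reused
    pad = _kek(passphrase, salt)
    return salt + bytes(a ^ b for a, b in zip(device_key, pad))


def unwrap_device_key(blob, passphrase):
    salt, body = blob[:16], blob[16:]
    pad = _kek(passphrase, salt)
    return bytes(a ^ b for a, b in zip(body, pad))


def change_passphrase(blob, old_passphrase, new_passphrase):
    """Re-wrap the device key; no data file is read or rewritten."""
    device_key = unwrap_device_key(blob, old_passphrase)
    return wrap_device_key(device_key, new_passphrase)


if __name__ == "__main__":
    key = os.urandom(32)
    stored = wrap_device_key(key, "correct horse")
    stored = change_passphrase(stored, "correct horse", "battery staple")
    print(unwrap_device_key(stored, "battery staple") == key)
```

The cost of `change_passphrase` is two key derivations and 32 bytes of XOR, independent of how much encrypted data the device holds.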
If for any reason the device-specific key needs to change,
then the device must re-encrypt the file-specific key in the header of each and every encrypted file it holds. That's probably faster than decrypting and re-encrypting the entire file.
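To make the cost comparison concrete, here is a rough sketch of rotating the device-specific key. Only the wrapped file key in each header is rewritten; the encrypted bodies are left alone. The `wrap`/`unwrap` pair is a toy HMAC-based XOR wrap for 32-byte keys with no authentication, and the `wrapped_key`/`body` dictionary layout is a hypothetical stand-in for a real file header.

```python
import hashlib
import hmac
import os


def wrap(kek, file_key):
    """Toy wrap of a 32-byte file key under a key-encryption key."""
    nonce = os.urandom(16)
    pad = hmac.new(kek, nonce, hashlib.sha256).digest()
    return nonce + bytes(a ^ b for a, b in zip(file_key, pad))


def unwrap(kek, blob):
    nonce, body = blob[:16], blob[16:]
    pad = hmac.new(kek, nonce, hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(body, pad))


def rotate_device_key(files, old_key, new_key):
    """Re-wrap every file key under new_key; bodies are never decrypted."""
    for f in files:
        file_key = unwrap(old_key, f["wrapped_key"])
        f["wrapped_key"] = wrap(new_key, file_key)


if __name__ == "__main__":
    old, new = os.urandom(32), os.urandom(32)
    fk = os.urandom(32)
    files = [{"wrapped_key": wrap(old, fk), "body": b"<ciphertext>"}]
    rotate_device_key(files, old, new)
    print(unwrap(new, files[0]["wrapped_key"]) == fk)
```

The work per file is constant (one 32-byte unwrap and wrap), which is why this is expected to be much cheaper than re-encrypting file contents.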
Have you considered using some off-the-shelf implementation of OpenPGP, such as "Pretty Good Privacy" or "GNU Privacy Guard"?
I would personally check the laws on this. If the data needs to be encrypted, then it needs to be encrypted.
If you don't receive any guidance though, I would aim to protect the link between the patient and their data. You most likely have a PatientID that's used in tables throughout the database. PatientID does not identify a patient, only the patient's medical history and so on. However, the mapping that identifies the PatientID as Joe Bloggs living at Rua de São Bernardo, Lisbon, I'd keep in a separate DB if I can. Use TDE for the patient's personal details and consider encrypting it on top of that using keys held in your web application.
Whilst theft of that medical data without the means to identify the patients will be extremely embarrassing, it is unlikely to be anything beyond that. There are literally online competitions that use this anonymised medical data.
Once the medical data is separated from the patient's personal details, use a robust set of roles to limit staff to only what they need. With the exception of medical staff who must deal with the patient directly (front-line nurses and doctors), no one should have access to both. Receptionists only need the patient's personal details, lab staff only need the medical record and PatientID, and surgical nurses only need the current medical condition and first name.
When you've identified each set of roles, aim to not only implement them in your web application, but also in the database as well as an extra layer of security.
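As a rough illustration of the application-side half of this, the role split described above could be expressed as a field allow-list. The role names, field names, and the `visible_fields` helper are all hypothetical; the database would enforce the same split separately with its own roles and grants.

```python
# Hypothetical field names; adapt to the real schema.
PERSONAL = {"first_name", "last_name", "address", "phone"}
MEDICAL = {"patient_id", "medical_record", "current_condition"}

ROLE_FIELDS = {
    "receptionist": PERSONAL,                              # personal details only
    "lab_staff": {"patient_id", "medical_record"},         # no identity
    "surgical_nurse": {"first_name", "current_condition"},
    "doctor": PERSONAL | MEDICAL,                          # front-line staff see both
}


def visible_fields(role, record):
    """Return only the fields of record that this role may see.

    Unknown roles get nothing (deny by default).
    """
    allowed = ROLE_FIELDS.get(role, frozenset())
    return {k: v for k, v in record.items() if k in allowed}


if __name__ == "__main__":
    record = {
        "patient_id": 1234,
        "first_name": "Joe",
        "last_name": "Bloggs",
        "address": "Rua de São Bernardo, Lisbon",
        "medical_record": "...",
        "current_condition": "...",
    }
    print(sorted(visible_fields("lab_staff", record)))
```

Denying by default for unknown roles keeps a misconfigured account from seeing anything, rather than everything.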
Best Answer
I'm going to start off with a slightly dickish response, namely, if you are uncertain about this stuff you should focus on finding a (reputable) library that will handle these sorts of things for you. I'm not all that familiar with Unity, so if such a library exists, I hope someone else can provide an answer pointing to it. As I'm sure you are aware, even major corporations mess these kinds of things up.
That said, I don't really understand from whom you are trying to protect this data. (I also don't understand why this would be sensitive, but that's not important.) As far as I can tell the PlayerPrefs are stored locally, so the kinds of attackers you are considering are 1) other applications on the device, 2) other users on the device, 3) attackers with physical access to the persistent memory, e.g. the SD card. Most of these are already protected against via operating system mechanisms. But to be completely clear, doing encryption yourself will increase protection beyond what the operating system is doing. The argument comes down to defining a threat model and doing the cost-benefit analysis of mitigating the threats. The cost-benefit analysis will include the development and maintenance costs for you, and the key management costs for the user, e.g. if the user loses their key, they lose their data and nothing can be done about it.
Addressing your second to last paragraph, if you just stored the symmetric key with the encrypted data you would have no security. You might as well store the unencrypted data.
Your four-step scheme should work (though there are details you need to get right), but it seems to be overkill. If the symmetric key is encrypted, then I'll need to have access to the private key to decrypt it. If I can securely store the private key, why don't I just store the symmetric key where I store the private key? Alternatively, unless it is a large amount of data, why not just encrypt the data directly with the asymmetric key pair (that is, with the public key, so that only the private key can decrypt it)? Or even, if I can store keys securely, why don't I just store the data there?
I only see something like your four step procedure being useful if the asymmetric keys are managed by the OS but no symmetric key management is provided, and you need to encrypt large amounts of data. In fact, decent crypto libraries will usually handle automatically generating a symmetric key and storing it encrypted with the ciphertext so you don't have to worry about it and you don't need to worry about size limitations or performance.