JSON Formatting – How to Represent a Set in JSON


JSON supports the following data structures (Java equivalents): Scalar, Array/List, and Map.

A Set is not supported out-of-the-box in JSON.

I thought about several ways to represent a set in JSON:

[1] – As a list

However, a list is ordered, so the two lists ["a", "b"] and ["b", "a"] are not equal as lists, although they should be equal as sets.
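The difference is easy to check. Here is a minimal sketch in Python (my choice of language for illustration, not something the question prescribes), using only the standard `json` module:

```python
import json

first = json.loads('["a", "b"]')
second = json.loads('["b", "a"]')

# As lists, element order matters, so these compare unequal.
lists_equal = first == second

# Converted to sets, order is irrelevant, so they compare equal.
sets_equal = set(first) == set(second)

print(lists_equal, sets_equal)  # False True
```

So any consumer that compares the parsed arrays directly will see two different values for what is logically the same set.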

[2] – As a map

Use the key-set of the map, and ignore the values.

But again, under standard comparison the following two maps are not equal, even though they have the same key set:

{"a": "foo", "b": "bar"}, {"a": null, "b": null}

[3] – As a map, with a special value

Take a scalar, say 0 or null, and force it to be the value of every key in the map:

{"a": 0, "b": 0}

This way, under standard comparison tools, the objects are equal, even if the key ordering is changed.

However, this technique pollutes the JSON document with irrelevant data.
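A rough sketch of this convention in Python, assuming 0 as the placeholder value. The helper name `set_to_json_object` is hypothetical, made up for this example:

```python
import json

def set_to_json_object(items):
    # Hypothetical helper: every member becomes a key with the placeholder value 0.
    return json.dumps({item: 0 for item in items})

doc_one = '{"a": 0, "b": 0}'
doc_two = '{"b": 0, "a": 0}'

# Parsed objects become dicts, and dict equality ignores key order.
same = json.loads(doc_one) == json.loads(doc_two)

# Iterating a dict yields its keys, so this recovers the set itself.
members = set(json.loads(doc_one))

print(same, members == {"a", "b"})  # True True
```

The comparison only works as long as every producer agrees on the same placeholder, which is exactly the consistency burden mentioned above.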

[4] – As a sorted list

Back to the first suggestion, but this time with the elements sorted. This largely solves the comparison issue, since two equal sets then serialize to the same list.

However, we should also keep in mind the cost of sorting, and the fact that map notation collapses duplicates, while a sorted list does not. Example:

{"a": 400, "a": 9} is typically parsed as {"a": 9} (most parsers keep the last value for a repeated key), but ["g", "g"] always stays ["g", "g"].
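Both points can be sketched in Python. The helper `canonical_array` is a hypothetical name for this example. It removes duplicates before sorting, which a plain sort would not do. The duplicate-key behavior shown is what Python's `json` module does; other parsers may behave differently.

```python
import json

def canonical_array(items):
    # Hypothetical helper: drop duplicates, then sort, so equal sets
    # always serialize to the same JSON text.
    return json.dumps(sorted(set(items)))

a = canonical_array(["b", "a", "b"])
b = canonical_array(["a", "b"])
print(a == b)  # True

# Python's json module keeps the last value for a repeated key.
dup = json.loads('{"a": 400, "a": 9}')
print(dup)  # {'a': 9}

# Duplicate array elements survive parsing untouched.
still_dup = json.loads('["g", "g"]')
print(still_dup)  # ['g', 'g']
```

Note that sorting requires the elements to be mutually comparable, so this sketch assumes a set of strings or a set of numbers, not a mix.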

Having said all that, it seems to me that the list notation is clearer, while the map notation is more robust to duplicate keys but makes it harder to stay consistent about the special value (even though null seems like a good choice for that).

What do you think? How would you represent a set in JSON?

P.S.

Note that this question is merely about JSON. I know that other formats, like YAML, are available. Still…

Best Answer

Well, you can't. As you said, you can represent arrays and dictionaries. You have two choices.

Represent the set as an array. Advantage: Converting from set to array and back is usually easy. Disadvantage: An array has an implied order, which a set doesn't, so converting identical sets to JSON arrays can create arrays that would be considered different. Also, there is no way to enforce that array elements are unique, so a JSON array might not contain a valid set (obviously you could just ignore the duplicates; that's what is likely to happen anyway).

Represent the set as a dictionary, with an arbitrary value per key, for example 0 or null. If you just ignore the values, this is a perfect match. On the other hand, you may have no library support for extracting the keys of a dictionary as a set, or for turning a set into a dictionary.

In my programming environment, conversion between set and array is easier (array to set will lose duplicate values, which either shouldn't be there, or would be considered correct), so for that reason I would go with arrays. But that is very much a matter of opinion.
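To illustrate how little code the array route can take, here is a sketch in Python, which is only an example environment and not necessarily the one meant above. It uses the `default` hook of `json.dumps`. The function name `encode_sets` and the `tags` field are invented for the example.

```python
import json

def encode_sets(obj):
    # json.dumps calls this only for values it cannot serialize natively.
    if isinstance(obj, (set, frozenset)):
        return sorted(obj)  # sorted so equal sets give identical output
    raise TypeError(f"Cannot serialize {type(obj).__name__}")

payload = {"tags": {"red", "green"}}
text = json.dumps(payload, default=encode_sets)
print(text)  # {"tags": ["green", "red"]}

# Decoding: the reader has to know which arrays are really sets.
data = json.loads(text)
data["tags"] = set(data["tags"])
print(data == payload)  # True
```

The decoding step shows the remaining weakness: nothing in the JSON itself marks the array as a set, so that knowledge has to live in the schema or in the consuming code.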

BUT: There is a big fat elephant in the room that hasn't been mentioned. The keys in a JSON dictionary can only be strings. If your set isn't a set of strings, then you only have the choice of using an array.
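A short Python sketch of that problem, using a set of integers. Python's `json` module silently converts the integer keys to strings on output, so the round trip through an object does not give back the original set:

```python
import json

numbers = {1, 2, 3}

# Object route: the int keys are coerced to strings when serialized.
as_object = json.dumps({n: None for n in numbers})
from_object = set(json.loads(as_object))
print(from_object == numbers)  # False, the members are now '1', '2', '3'

# Array route: element types are preserved.
as_array = json.dumps(sorted(numbers))
from_array = set(json.loads(as_array))
print(from_array == numbers)  # True
```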
