C# – Data structure for accessing units of measure

Tags: c#, data structures, design, naming

TL;DR – I'm trying to design an optimal data structure to define units within a unit of measure.


A unit of measure is essentially a value (or quantity) associated with a unit. SI has seven base dimensions: length, mass, time, electric current, temperature, amount of substance (moles), and luminous intensity.

This would be straightforward enough, but there are a number of derived units as well as rates that we frequently use. An example derived unit is the newton (kg * m / s^2), and an example rate is tons / hr.

We have an application that relies heavily upon implied units: we embed the units within the variable or column name. This creates problems when we need to express the same measure in different units. Yes, we can convert the values at input and display, but this generates a lot of overhead code that we'd like to encapsulate within its own class.

There are a number of solutions out on CodePlex and other collaborative environments. The licensing for the projects is agreeable, but the project itself usually ends up being too lightweight or too heavy. We're chasing our own unicorn of "just right."

Ideally, I could define a new unit of measure using something like this:

UOM myUom1 = new UOM(10, volts);
UOM myUom2 = new UOM(43.2, Newtons);

Of course, we use a mix of Imperial and SI units based upon our clients' needs.

We also need to keep this structure of units synchronized with a future database table so we can provide the same degree of consistency within our data too.


What's the best way of defining the units, derived units, and rates that we need to use to create our unit of measure class? I could see using one or more enums, but that could be frustrating for other developers. A single enum would be huge with 200+ entries whereas multiple enums could be confusing based upon SI vs Imperial units and additional breakdown based upon categorization of the unit itself.

Enum examples showing some of my concerns:

myUnits.Volt
myUnits.Newton
myUnits.meter

SIUnit.meter
ImpUnit.foot
DrvdUnit.Newton
DrvdUnitSI.Newton
DrvdUnitImp.FtLbs

Our set of units in use is pretty well defined and it's a finite space. We do need the ability to expand and add new derived units or rates when we have client demand for them. The project is in C# although I think the broader design aspects are applicable to multiple languages.


One of the libraries I looked at allows free-form input of units via a string. Its UOM class then parses the string and slots things accordingly. The challenge with this approach is that it forces the developer to remember the correct string formats, and we risk a runtime exception unless we add checks to validate the strings passed to the constructor.

Another library created more classes than a developer should have to work with. Along with an equivalent of UOM, it provided a DerivedUnit, a RateUnit, and so on. The code was overly complex for the problems we're solving. That library allows any:any combinations (which is legitimate in the units world), but we're happy to narrow our problem and simplify our code by not allowing every possible combination.

Other libraries were ridiculously simple and hadn't even considered operator overloading for example.

In addition, I'm not too worried about attempts at incorrect conversions (for example, volts to meters). For now, only developers will work at this level, and we don't necessarily need to protect against those kinds of mistakes.

Best Answer

The Boost libraries for C++ include an article on dimensional analysis that presents a sample implementation of handling units of measure.

To summarize: Units of measurement are represented as vectors, with each element of the vector representing a fundamental dimension:

typedef int dimension[7]; // m  l  t  ...
dimension const mass      = {1, 0, 0, 0, 0, 0, 0};
dimension const length    = {0, 1, 0, 0, 0, 0, 0};
dimension const time      = {0, 0, 1, 0, 0, 0, 0};

Derived units are combinations of these. For example, force (mass * length / time^2) would be represented as

dimension const force  = {1, 1, -2, 0, 0, 0, 0};
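
As a rough illustration of how such vectors combine, here is a minimal runtime sketch (no template metaprogramming). Multiplying two units adds their exponents and dividing subtracts them. The names `Dimension`, `kMass`, `kLength`, `kTime`, `multiply`, and `divide` are made up for this example and do not come from Boost.

```cpp
#include <array>
#include <cassert>  // only needed if you want to assert on the results
#include <cstddef>

// Exponents of the seven base dimensions, in the order used above:
// mass, length, time, then the remaining four.
using Dimension = std::array<int, 7>;

// Hypothetical names; the k prefix avoids clashing with ::time from <ctime>.
const Dimension kMass   = {1, 0, 0, 0, 0, 0, 0};
const Dimension kLength = {0, 1, 0, 0, 0, 0, 0};
const Dimension kTime   = {0, 0, 1, 0, 0, 0, 0};

// Multiplying two units adds their exponents element by element.
Dimension multiply(const Dimension& a, const Dimension& b) {
    Dimension result{};
    for (std::size_t i = 0; i < result.size(); ++i) {
        result[i] = a[i] + b[i];
    }
    return result;
}

// Dividing two units subtracts the divisor's exponents.
Dimension divide(const Dimension& a, const Dimension& b) {
    Dimension result{};
    for (std::size_t i = 0; i < result.size(); ++i) {
        result[i] = a[i] - b[i];
    }
    return result;
}
```

With these helpers, `divide(multiply(kMass, kLength), multiply(kTime, kTime))` yields `{1, 1, -2, 0, 0, 0, 0}`, the same vector as `force` above.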

Imperial versus SI units could be handled by adding a conversion factor.
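
One way to picture the conversion factor is to pair each unit's dimension vector with a multiplier that takes a value in that unit to the corresponding SI unit. The sketch below assumes this layout. `Unit`, `to_si`, `kMeter`, `kFoot`, `kSecond`, and `convert` are hypothetical names for illustration, not part of any library.

```cpp
#include <array>
#include <stdexcept>

// Exponents of the seven base dimensions: mass, length, time, ...
using Dimension = std::array<int, 7>;

// Hypothetical unit description: a dimension vector plus the factor that
// converts a value in this unit to the SI unit of the same dimension.
struct Unit {
    Dimension dim;
    double to_si;
};

const Unit kMeter  = {{0, 1, 0, 0, 0, 0, 0}, 1.0};
const Unit kFoot   = {{0, 1, 0, 0, 0, 0, 0}, 0.3048};  // 1 ft = 0.3048 m exactly
const Unit kSecond = {{0, 0, 1, 0, 0, 0, 0}, 1.0};

// Convert a value between two units of the same dimension by going
// through SI: scale up by the source factor, down by the target factor.
double convert(double value, const Unit& from, const Unit& to) {
    if (from.dim != to.dim) {
        throw std::invalid_argument("units have different dimensions");
    }
    return value * from.to_si / to.to_si;
}
```

Here `convert(10.0, kFoot, kMeter)` gives approximately 3.048. The dimension check is optional if, as in the question, mismatched conversions are not a concern, but it costs one array comparison. This simple factor does not cover units with an offset, such as degrees Fahrenheit.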

This implementation relies on C++-specific techniques (using template metaprogramming to easily turn different units of measurement into different compile-time types), but the concepts should transfer to other programming languages.
