C++ Coding Standards – Does a Long Ban Make Sense?

64-bit, c, coding-standards, data types

In today's cross-platform C++ (or C) world we have:

Data model     | short | int | long | long long | pointers/size_t | Sample operating systems
...
LLP64/IL32P64  | 16    | 32  | 32   | 64        | 64              | Microsoft Windows (x86-64 and IA-64)
LP64/I32LP64   | 16    | 32  | 64   | 64        | 64              | Most Unix and Unix-like systems, e.g. Solaris, Linux, BSD, and OS X; z/OS
...

What this means today is that for any "common" (signed) integer, int will suffice and can arguably still be used as the default integer type when writing C++ application code. It will also, for current practical purposes, have a consistent size across platforms.

If a use case requires at least 64 bits, we can today use long long, though one of the bitness-specifying types or the __int64 type might make more sense.
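The widths in the table can be checked at compile time. The following is only a sketch: data_model() and require_known_data_model() are hypothetical helpers, the "ILP32" label for 32-bit builds is an assumption that goes beyond the rows shown above, and the static_asserts encode the assumptions from the question rather than anything the language guarantees.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstring>

// Width of a type in bits, computed from its size in bytes.
constexpr int bits(std::size_t bytes) {
    return static_cast<int>(bytes * CHAR_BIT);
}

// Hypothetical helper: names the data model of the current build from the
// widths of long and of pointers. "ILP32" and "other" are assumptions that
// are not taken from the table.
constexpr const char* data_model() {
    return (bits(sizeof(long)) == 32 && bits(sizeof(void*)) == 64) ? "LLP64"
         : (bits(sizeof(long)) == 64 && bits(sizeof(void*)) == 64) ? "LP64"
         : (bits(sizeof(long)) == 32 && bits(sizeof(void*)) == 32) ? "ILP32"
         : "other";
}

// Assumptions from the question, turned into build failures if violated.
static_assert(bits(sizeof(int)) == 32, "int is assumed to be 32 bits wide");
static_assert(bits(sizeof(long long)) >= 64, "long long must be at least 64 bits wide");

// Usage sketch: refuse to run on a data model the code was never tested on.
inline void require_known_data_model() {
    assert(std::strcmp(data_model(), "other") != 0);
}
```

A build that breaks one of these assumptions then fails to compile instead of misbehaving at run time.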

This leaves long in the middle, and we're considering banning the use of long from our application code outright.

Would this make sense, or is there a case for using long in modern C++ (or C) code that has to run cross-platform? (Platform here means desktop and mobile devices, but not things like microcontrollers, DSPs, etc.)

Possibly interesting background links:

What does the C++ standard state the size of int, long type to be?
Why did the Win64 team choose the LLP64 model?
64-Bit Programming Models: Why LP64? (somewhat aged)
Is long guaranteed to be at least 32 bits? (This addresses the comment discussion below. Answer.)

Best Answer

The only reason I would use long today is when calling or implementing an external interface that uses it.

As you say in your post, short and int have reasonably stable characteristics across all major desktop/server/mobile platforms today, and I see no reason for that to change in the foreseeable future. So I see little reason to avoid them in general.

long, on the other hand, is a mess. On all 32-bit systems I'm aware of, it had the following characteristics:

1. It was exactly 32 bits in size.
2. It was the same size as a memory address.
3. It was the same size as the largest unit of data that could be held in a normal register and worked on with a single instruction.

A large amount of code was written based on one or more of these characteristics. However, with the move to 64-bit, it was not possible to preserve all of them. Unix-like platforms went for LP64, which preserved characteristics 2 and 3 at the cost of characteristic 1. Win64 went for LLP64, which preserved characteristic 1 at the cost of characteristics 2 and 3. The result is that you can no longer rely on any of those characteristics, and that, IMO, leaves little reason to use long.
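To make the trade-off concrete, here is a small sketch that reports which of the first two characteristics still hold in a given build. LongAssumptions and both functions are hypothetical names. The third characteristic (register width) is left out because standard C++ offers no portable way to query it.

```cpp
#include <cassert>
#include <climits>

// Hypothetical summary of which historical assumptions about long
// still hold in the current build.
struct LongAssumptions {
    bool exactly_32_bits;  // characteristic 1
    bool pointer_sized;    // characteristic 2
};

constexpr LongAssumptions long_assumptions() {
    return LongAssumptions{sizeof(long) * CHAR_BIT == 32,
                           sizeof(long) == sizeof(void*)};
}

// Usage sketch: with 64-bit pointers, a 32-bit long cannot also be
// pointer sized, so at most one of the two flags can be set.
inline void check_long_assumptions() {
    const LongAssumptions a = long_assumptions();
    if (sizeof(void*) * CHAR_BIT == 64) {
        assert(!(a.exactly_32_bits && a.pointer_sized));
    }
}
```

On an LP64 build only pointer_sized would be expected to be true, and on an LLP64 build only exactly_32_bits.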

If you want a type that is exactly 32 bits in size, you should use int32_t.

If you want a type that is the same size as a pointer, you should use intptr_t (or better, uintptr_t).

If you want a type that is the largest item that can be worked on in a single register/instruction, then unfortunately I don't think the standard provides one. size_t should be right on most common platforms, but it wouldn't be on x32.
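The first two recommendations can be sketched as follows. round_trips() and demo_round_trip() are hypothetical helpers. The second static_assert is an assumption that holds on mainstream platforms; the standard itself only promises that a void* survives a round trip through uintptr_t, not that the sizes match.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// int32_t, where it exists, is exactly 32 bits wide by definition.
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32,
              "int32_t is exactly 32 bits wide");

// Assumption: true on mainstream desktop/server/mobile platforms,
// but not something the standard requires.
static_assert(sizeof(std::uintptr_t) == sizeof(void*),
              "uintptr_t is assumed to be pointer sized");

// Hypothetical helper: converts an object pointer to an integer and back,
// and reports whether the original pointer value was recovered.
inline bool round_trips(const void* p) {
    const std::uintptr_t as_integer = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<const void*>(as_integer) == p;
}

// Usage sketch.
inline void demo_round_trip() {
    int value = 42;
    assert(round_trips(&value));
}
```

Code that used to store addresses in a long works on both LP64 and LLP64 once it is switched to uintptr_t.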

P.S.

I wouldn't bother with the "fast" or "least" types. The "least" types only matter if you care about portability to really obscure architectures where CHAR_BIT != 8. The size of the "fast" types in practice seems to be pretty arbitrary. Linux seems to make them at least the same size as a pointer, which is silly on 64-bit platforms with fast 32-bit support like x86-64 and arm64. IIRC iOS makes them as small as possible. I'm not sure what other systems do.
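If you want to see what a particular toolchain picked, a sketch like the following reports the widths. Widths32, widths32() and check_widths32() are hypothetical names, and the only properties checked are the ones the standard promises.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical record of the storage widths, in bits, of the three
// 32-bit flavours from <cstdint>.
struct Widths32 {
    int exact;  // int32_t
    int least;  // int_least32_t
    int fast;   // int_fast32_t
};

constexpr Widths32 widths32() {
    return Widths32{static_cast<int>(sizeof(std::int32_t) * CHAR_BIT),
                    static_cast<int>(sizeof(std::int_least32_t) * CHAR_BIT),
                    static_cast<int>(sizeof(std::int_fast32_t) * CHAR_BIT)};
}

// Usage sketch: the "least" and "fast" types must be at least 32 bits wide,
// but how much wider the "fast" type is depends on the platform.
inline void check_widths32() {
    const Widths32 w = widths32();
    assert(w.exact == 32);
    assert(w.least >= 32);
    assert(w.fast >= 32);
}
```

Printing the fast field on each target platform is a quick way to check the differences described above.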

P.P.S.

One reason to use unsigned long (but not plain long) is that it is guaranteed to have modulo behaviour. Unfortunately, due to C's screwed-up promotion rules, unsigned types smaller than int do not have modulo behaviour: they are promoted to signed int before the arithmetic is performed.

On all major platforms today, uint32_t is the same size as or larger than int and hence has modulo behaviour. However, there have historically been, and there could theoretically be in the future, platforms where int is 64-bit and hence uint32_t does not have modulo behaviour.

Personally I would say it's better to get in the habit of forcing modulo behaviour by using "1u *" or "0u +" at the start of your equations, as this will work for any size of unsigned type.
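A minimal sketch of the "1u *" idiom, assuming the common case of a 32-bit int. mul_wrap16(), mul_wrap32() and demo_mul_wrap() are hypothetical helpers written for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Multiplies two 16-bit values with wraparound. Without the leading "1u *",
// both operands would be promoted to signed int wherever int is wider than
// 16 bits, and 65535 * 65535 overflows a 32-bit int (undefined behaviour).
inline std::uint16_t mul_wrap16(std::uint16_t a, std::uint16_t b) {
    return static_cast<std::uint16_t>(1u * a * b);
}

// The same idiom for uint32_t. It changes nothing where int is 32 bits,
// but keeps the arithmetic unsigned on a platform with a 64-bit int.
inline std::uint32_t mul_wrap32(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>(1u * a * b);
}

// Usage sketch: (2^N - 1) squared is congruent to 1 modulo 2^N.
inline void demo_mul_wrap() {
    assert(mul_wrap16(65535u, 65535u) == 1u);
    assert(mul_wrap32(4294967295u, 4294967295u) == 1u);
}
```

The cast on the result only narrows the already well-defined unsigned product back to the operand type.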
