C++ Volatile Variables – Do I Need to Declare a Delay Timer Variable as Volatile?

c, compiler, optimization

This question is more about using volatile to prevent optimization than about caching reads and writes of a variable. I'm asking specifically about timer delay variables, since I don't want to declare everything volatile and give up optimization by doing so.

First off, I know the following snippet will not work if optimization is turned on unless I declare i volatile:

int i;
for(i=0; i<LARGE_NUM; i++); //Delay for a bit

What if the loop invokes a function and waits on that instead? Will the compiler look across source files and optimize the call out? I'm asking because some compilers offer multi-file compilation.

//A.c
#include "B.h"    //B.h has declaration for dec(int)

...
int i;
for(i=LARGE_NUM; i>0; i=dec(i));
...

//B.c

int dec(int a){
  return a-1;
}

What about something a bit more complex that involves a hardware interrupt?

//A.c
#include "B.h"

...
timer_start(LARGE_NUM);
while(timer_busy());
...

void hardware_timer_isr(void){    //Hardware timer interrupt
    timer_tick();
}

//B.c

static int time=0;

void timer_start(int t){
    time = t;
}

int timer_busy(void){
    return time>0;
}

void timer_tick(void){
    if(time > 0)
        time--;
}

Will I need to declare the variables as volatile in any of these cases? Or is there anything else in particular that I should look out for?

Best Answer

The compiler operates under the as-if rule that allows any and all code transformations that don't change the observable behavior of the program.

[C++14: 1.5/8]

The least requirements on a conforming implementation are:

Access to volatile objects are evaluated strictly according to the rules of the abstract machine.

At program termination, all data written into files shall be identical to one of the possible results that execution of the program according to the abstract semantics would have produced.

The input and output dynamics of interactive devices shall take place in such a fashion that prompting output is actually delivered before a program waits for input. What constitutes an interactive device is implementation-defined.

These collectively are referred to as the observable behavior of the program.

[C11 5.1.2.3.6 Program execution] has a similar wording:

The least requirements on a conforming implementation are:

Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine.

At program termination, all data written into files shall be identical to the result that execution of the program according to the abstract semantics would have produced.

The input and output dynamics of interactive devices shall take place as specified in 7.21.3. The intent of these requirements is that unbuffered or line-buffered output appear as soon as possible, to ensure that prompting messages actually appear prior to a program waiting for input.

This is the observable behavior of the program.

A delay is not considered an observable behavior and the first example can be "optimized" to an empty program.
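For contrast, here is a minimal sketch of the first example with a volatile counter. Because every read and write of a volatile object counts as observable behavior, the compiler has to keep the loop. The name busy_delay is made up for illustration, and the actual delay is uncalibrated: it depends on the clock speed, the compiler, and the optimization level.

```cpp
// Hypothetical helper (not from the question): spins for `iterations` passes.
// The counter is volatile, so each load and store of it must really happen
// and the loop cannot be removed under the as-if rule.
unsigned long busy_delay(unsigned long iterations) {
    volatile unsigned long i = 0;
    while (i < iterations) {
        i = i + 1;  // explicit read then write; avoids ++ on a volatile object
    }
    return i;       // final counter value, returned only so callers can check it
}
```

Only the counter needs to be volatile here; the rest of the program is still optimized normally.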

Note that, to allow compiler transformations such as the removal of empty loops (even when termination cannot be proven), the C++14 standard says:

[C++14: 1.10/24]

The implementation may assume that any thread will eventually do one of the following:

terminate,

make a call to a library I/O function,

access or modify a volatile object, or

perform a synchronization operation or an atomic operation.

[ Note: This is intended to allow compiler transformations such as removal of empty loops, even when termination cannot be proven. —end note ]

The second example is harder for the compiler because it usually cannot analyze code in another translation unit or an external library to determine whether it performs I/O or volatile accesses. However, statically linked code, including third-party library code, may be subject to link-time optimization, so a multi-file organization isn't an insurmountable barrier.

The third example doesn't introduce anything new:

the interrupt vector entry is initialized at program startup;
that involves taking the address of the handler function, which is sufficient to protect hardware_timer_isr from being optimized out;
but the time variable isn't explicitly marked as a variable that interrupt handlers can change, so the instructions:
```
 timer_start(LARGE_NUM);
 while(timer_busy());
```
have no observable behavior and can be optimized out (see "Why does the compiler not optimize away interrupt code?" for further details).
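A minimal sketch of how B.c could mark the counter, assuming a single-core target where the ISR simply interrupts the main loop. The name ticks_remaining is illustrative (the question calls the variable time); the function names follow the question.

```cpp
namespace {
// Written by the ISR and read by the main loop. volatile forces a real load
// on every call to timer_busy(), even after inlining or link-time optimization.
volatile int ticks_remaining = 0;
}  // namespace

void timer_start(int t) { ticks_remaining = t; }

bool timer_busy() { return ticks_remaining > 0; }

// Meant to be called from the hardware timer ISR.
void timer_tick() {
    int t = ticks_remaining;      // one volatile read
    if (t > 0) {
        ticks_remaining = t - 1;  // one volatile write
    }
}
```

volatile only guarantees that the accesses happen; it does not make them atomic. On a target where an int cannot be read or written in a single instruction, the main-loop accesses would also need to be protected, for example by briefly disabling the timer interrupt.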

If you need a delay in hosted C++ you can use std::this_thread::sleep_for or std::this_thread::sleep_until (declared in <thread>).
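A short sketch of a library-based delay, assuming a hosted C++11 (or later) environment with thread support. The helper name delay_ms is made up for illustration; it also reports how long the call actually took, since the library only promises to block for at least the requested time.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: blocks the calling thread for at least `ms`
// milliseconds and returns the measured duration of the call.
std::chrono::milliseconds delay_ms(int ms) {
    const auto start = std::chrono::steady_clock::now();
    std::this_thread::sleep_for(std::chrono::milliseconds(ms));
    const auto elapsed = std::chrono::steady_clock::now() - start;
    return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed);
}
```

Unlike a counting loop, this does not depend on the optimizer or the CPU speed, and the thread yields the processor while it waits.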

FURTHER NOTES

What if I deliberately put a while(1); or a similarly obfuscated infinite loop to intentionally halt the program? According to C++14 1.10/24 above, even though the termination can not be proven, the loop itself does not change the observable behavior of the program and thus can be legally removed, right?

The reason for the note ("this is intended to allow compiler transformations such as removal of empty loops, even when termination cannot be proven") is that there is no universal way to detect infinite loops, and the inability to prove termination hampers compilers that could otherwise make useful transformations (there is a good example in N1528: Why undefined behavior for infinite loops?).

For the C++ language the situation is described in:

Optimizing away a “while(1);” in C++0x by Hans Boehm
Compilers and Termination Revisited by John Regehr

For the C language, C11 provides an exception for loops whose controlling expression is a constant expression (C11 6.8.5/6), so in C a loop such as while(1); may not be assumed to terminate.

Related Solutions

C++ Type Safety – Understanding Strongly Typed Typedef

These are phantom type parameters, that is, parameters of a parameterised type that are used not for their representation, but to separate different “spaces” of types with the same representation.

And speaking of spaces, that’s a useful application of phantom types:

template<typename Space>
struct Point { double x, y; };

struct WorldSpace;
struct ScreenSpace;

// Conversions between coordinate spaces are explicit.
Point<ScreenSpace> project(Point<WorldSpace> p, const Camera& c) { … }

As you’ve seen, though, there are some difficulties with unit types. One thing you can do is decompose units into a vector of integer exponents on the fundamental components:

template<typename T, int Meters, int Seconds>
struct Unit {
  Unit(const T& value) : value(value) {}
  T value;
};

template<typename T, int MA, int MB, int SA, int SB>
Unit<T, MA - MB, SA - SB>
operator/(const Unit<T, MA, SA>& a, const Unit<T, MB, SB>& b) {
  return a.value / b.value;
}

Unit<double, 0, 0> one(1);
Unit<double, 1, 0> one_meter(1);
Unit<double, 0, 1> one_second(1);

// Unit<double, 1, -1>
auto one_meter_per_second = one_meter / one_second;

Here we’re using phantom values to tag runtime values with compile-time information about the exponents on the units involved. This scales better than making separate structures for velocities, distances, and so on, and might be enough to cover your use case.
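The same idea extends to other operators. The sketch below repeats the Unit template so that it compiles on its own, then adds multiplication (exponents add) and addition (exponents must match). The aliases Length and Area and the function rectangle_area are hypothetical names added for illustration.

```cpp
#include <type_traits>

template<typename T, int Meters, int Seconds>
struct Unit {
  Unit(const T& value) : value(value) {}
  T value;
};

// Multiplying two quantities adds their exponents.
template<typename T, int MA, int SA, int MB, int SB>
Unit<T, MA + MB, SA + SB>
operator*(const Unit<T, MA, SA>& a, const Unit<T, MB, SB>& b) {
  return Unit<T, MA + MB, SA + SB>(a.value * b.value);
}

// Addition is only defined when both operands have identical exponents,
// so adding meters to seconds fails to compile.
template<typename T, int M, int S>
Unit<T, M, S> operator+(const Unit<T, M, S>& a, const Unit<T, M, S>& b) {
  return Unit<T, M, S>(a.value + b.value);
}

using Length = Unit<double, 1, 0>;  // meters
using Area   = Unit<double, 2, 0>;  // square meters

// The return type documents and enforces the resulting unit.
inline Area rectangle_area(Length width, Length height) {
  return width * height;
}

static_assert(std::is_same<decltype(Length(1.0) * Length(1.0)), Area>::value,
              "meters times meters should be square meters");
```

A mismatch such as returning width + height from rectangle_area would be rejected at compile time, because the sum has type Length rather than Area.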

Can a pimpl variation be implemented without any performance penalty

The selling points of the Pimpl pattern are:

total encapsulation: there are no (private) data members mentioned in the header file of the interface object.
stability: until you break the public interface (which in C++ includes private members), you'll never have to recompile code that depends on the interface object. This makes the Pimpl a great pattern for libraries that don't want their users to recompile all code on every internal change.
polymorphism and dependency injection: the implementation or behaviour of the interface object can be easily swapped out at runtime, without requiring dependent code to be recompiled. Great if you need to mock something for a unit test.

To this effect, the classic Pimpl consists of three parts:

An interface for the implementation object, which must be public, and use virtual methods for the interface:
```
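class IFrobnicateImpl
{
public:
    // Virtual destructor: implementations are destroyed through a base pointer.
    virtual ~IFrobnicateImpl() = default;
    virtual int frobnicate(int) const = 0;
};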
```
This interface is required to be stable.

An interface object that proxies to the private implementation. It does not have to use virtual methods. The only allowed member is a pointer to the implementation:

class Frobnicate
{
    std::unique_ptr<IFrobnicateImpl> _impl;
public:
    explicit Frobnicate(std::unique_ptr<IFrobnicateImpl>&& impl = nullptr);
    int frobnicate(int x) const { return _impl->frobnicate(x); }
};

...

Frobnicate::Frobnicate(std::unique_ptr<IFrobnicateImpl>&& impl /* = nullptr */)
: _impl(std::move(impl))
{
    if (!_impl)
        _impl = std::make_unique<DefaultImplementation>();
}

The header file of this class must be stable.

At least one implementation

The Pimpl then buys us a great deal of stability for a library class, at the cost of one heap allocation and additional virtual dispatch.
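Putting the three parts together, a compact single-file sketch could look like the following. The behaviour of DefaultImplementation (adding one) and the FixedAnswer test double are assumptions invented for illustration. In a real library the implementation classes and the constructor body would live in a .cpp file rather than next to the interface.

```cpp
#include <memory>
#include <utility>

// Part 1: stable interface for implementation objects.
class IFrobnicateImpl {
public:
    virtual ~IFrobnicateImpl() = default;
    virtual int frobnicate(int) const = 0;
};

// Part 3: at least one implementation (hypothetical behaviour).
class DefaultImplementation : public IFrobnicateImpl {
public:
    int frobnicate(int x) const override { return x + 1; }
};

// Part 2: interface object whose only member is the implementation pointer.
class Frobnicate {
    std::unique_ptr<IFrobnicateImpl> _impl;
public:
    explicit Frobnicate(std::unique_ptr<IFrobnicateImpl>&& impl = nullptr)
    : _impl(std::move(impl))
    {
        if (!_impl)
            _impl = std::make_unique<DefaultImplementation>();
    }
    int frobnicate(int x) const { return _impl->frobnicate(x); }
};

// Hypothetical test double, injected in place of the default.
class FixedAnswer : public IFrobnicateImpl {
public:
    int frobnicate(int) const override { return 42; }
};
```

Constructing Frobnicate with no argument uses the default implementation, while passing std::make_unique<FixedAnswer>() swaps the behaviour without touching the code that calls frobnicate.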

How does your solution measure up?

It does away with encapsulation. Since your members are protected, any subclass can mess with them.
It does away with interface stability. Whenever you change your data members – and that change is just one refactoring away – you'll have to recompile all dependent code.
It does away with the virtual dispatch layer, preventing easy swapping of the implementation.

So your solution fails to fulfil every objective of the Pimpl pattern. It is therefore not reasonable to call your pattern a variation of the Pimpl; it is much more an ordinary class. Actually, it's worse than an ordinary class because your member variables are private, and because of that cast, which is a glaring point of fragility.

Note that the Pimpl pattern is not always optimal – there's a tradeoff between stability and polymorphism on the one hand, and memory compactness on the other. It is semantically impossible for a language to have both (without JIT compilation). So if you're micro-optimizing for memory compactness, clearly the Pimpl is not a suitable solution for your use case. You'll also probably stop using half the standard library, since these awful string and vector classes involve dynamic memory allocations ;-)
