C++ – Statistics Collection Engine for Systems

c++ design object-oriented statistics

We have a research project with an idea -> prototype -> statistics development cycle.
Our final product is a prototype, so the statistics collection suite is
not used persistently. Suppose I have the following class:

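struct Block {};  // placeholder block type (assumed for illustration)
struct Data {};   // placeholder for the internal state (assumed)

class Transform {
    private:
        Data someData;

    public:
        void transformForward(const Block& inp_block, Block& outp_block);
        void transformBackward(const Block& inp_block, Block& outp_block);
};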

Imagine that it is part of a big system. I need to periodically collect some statistics on such a transform (its internal data may be of interest as well). Adding statistical routines to the class seems to violate the single-responsibility principle. The second smell is that I do not want to touch the code that uses Transform instances in order to invoke these routines explicitly. It would be best if I could just flip some kind of switch so that the statistics for that module are collected.

I've met this challenge a number of times and I have a feeling that I'm constantly reinventing the wheel. Are there any good practices for configuring a statistics suite and collecting statistics for a compound system without interfering with its internal code base?

UPDATE:

As I can see from the proposed answers, my question is not specific enough, so I'll provide a more concrete example.

Consider an image-compression system composed of two huge blocks: Predictor and Encoder. There are many different prediction and compression algorithms, and during our research we need to explore the behavior of the components under various conditions. We have to answer questions like "how many times is a pixel processed within each context", "how well does this predictor work", "how does each predictor affect the Encoder's internal state", and many others.

Our final product is just a Codec with no statistical suite shipped with it; all kinds of statistics collection are used only internally during our research. Thus the question arises: how could one build a flexible statistics engine that knows the very internals of the system? And how could one keep the system itself independent of the statistics engine?

Best Answer

The Observer pattern might be a good fit here. The Transform class defines a set of events that might be of interest to the statistics engine, and the statistics engine registers itself with the relevant Transform instance to gather the statistics on that transformation (or the statistics that include that transformation).
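To make the idea concrete, here is a minimal sketch of how such a registration could look. Everything in it is an assumption made for illustration: the TransformEvent struct, the addObserver method, the StatisticsEngine class and the use of std::vector<int> as a block type do not come from the question, and the transform bodies are placeholders.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event payload; the fields are illustrative only.
struct TransformEvent {
    std::string name;       // "forward" or "backward"
    std::size_t blockSize;  // number of elements in the processed block
};

class Transform {
public:
    using Observer = std::function<void(const TransformEvent&)>;

    // The only statistics-related API that Transform exposes.
    void addObserver(Observer obs) { observers_.push_back(std::move(obs)); }

    void transformForward(const std::vector<int>& in, std::vector<int>& out) {
        out = in;  // placeholder for the real transform
        notify({"forward", in.size()});
    }

    void transformBackward(const std::vector<int>& in, std::vector<int>& out) {
        out = in;  // placeholder for the real inverse transform
        notify({"backward", in.size()});
    }

private:
    // With no observers registered this is a loop over an empty vector.
    void notify(const TransformEvent& ev) const {
        for (const auto& obs : observers_) obs(ev);
    }

    std::vector<Observer> observers_;
};

// Lives outside the codec; Transform has no dependency on this class.
// The engine must outlive every Transform it is attached to, because the
// registered callback captures `this`.
class StatisticsEngine {
public:
    void attach(Transform& t) {
        t.addObserver([this](const TransformEvent& ev) {
            ++calls_[ev.name];
            elements_ += ev.blockSize;
        });
    }

    std::size_t calls(const std::string& name) const {
        auto it = calls_.find(name);
        return it == calls_.end() ? 0 : it->second;
    }

    std::size_t elements() const { return elements_; }

private:
    std::map<std::string, std::size_t> calls_;
    std::size_t elements_ = 0;
};
```

Code that uses Transform stays unchanged: a research build constructs a StatisticsEngine and calls attach on the instances of interest, while the shipped codec simply never registers an observer.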

Update

As stated in a comment, the basic problem is how the statistics engine finds out that something of interest has happened. You could execute the codec in a virtual machine to keep track of everything (Valgrind uses this approach to check memory accesses), but then you have the problem of deciding what it means that the codec accessed address 0x12345678.

All other methods of statistics gathering intrude on the codec's code base in one way or another. The least invasive is probably to add copious logging and let the statistics engine analyse the logs. Most logging packages also provide a way to disable log generation at minimal cost, sometimes even compiling those statements to no-ops.
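As a rough illustration of the "compile to no-ops" idea, the sketch below uses a hand-rolled macro rather than any particular logging library. STATS_ENABLED, STATS_LOG, statsSink and predictPixel are all hypothetical names invented for this example, and the averaging predictor is only a stand-in.

```cpp
#include <sstream>
#include <string>

// STATS_ENABLED is a hypothetical build flag, e.g. -DSTATS_ENABLED=1 for
// research builds. It defaults to off, so the shipped codec carries no logging.
#ifndef STATS_ENABLED
#define STATS_ENABLED 0
#endif

// In-memory sink that a statistics engine could parse later; a real setup
// would more likely write to a file or a logging library.
inline std::ostringstream& statsSink() {
    static std::ostringstream sink;
    return sink;
}

#if STATS_ENABLED
#define STATS_LOG(expr) do { statsSink() << expr << '\n'; } while (0)
#else
// The arguments are dropped by the preprocessor, so nothing is evaluated.
#define STATS_LOG(expr) do { } while (0)
#endif

// Stand-in for a piece of codec code instrumented with a log statement.
int predictPixel(int left, int up) {
    int prediction = (left + up) / 2;
    STATS_LOG("predict left=" << left << " up=" << up << " -> " << prediction);
    return prediction;
}
```

The codec still contains the STATS_LOG lines, so this is not free of intrusion, but the statistics engine itself stays a separate program that only reads the log output.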
