Why use an opaque “handle” that requires casting in a public API rather than a typesafe struct pointer

api-designclibraries

I'm evaluating a library whose public API currently looks like this:

libengine.h

/* Handle, used for all APIs */
typedef size_t enh;


/* Create new engine instance; result returned in handle */
int en_open(int mode, enh *handle);

/* Start an engine */
int en_start(enh handle);

/* Add a new hook to the engine; hook handle returned in h2 */
int en_add_hook(enh handle, int hooknum, enh *h2);

Note that enh is a generic handle, used as a handle to several different datatypes (engines and hooks).

Internally, most of these APIs of course cast the "handle" to an internal structure which they've malloc'd:

engine.c

struct engine
{
    // ... implementation details ...
};

int en_open(int mode, *enh handle)
{
    struct engine *en;

    en = malloc(sizeof(*en));
    if (!en)
        return -1;

    // ...initialization...

    *handle = (enh)en;
    return 0;
}

int en_start(enh handle)
{
    struct engine *en = (struct engine*)handle;

    return en->start(en);
}

Personally, I hate hiding things behind typedefs, especially when it compromises type safety. (Given an enh, how do I know to what it's actually referring?)

So I submitted a pull request, suggesting the following API change (after modifying the entire library to conform):

libengine.h

struct engine;           /* Forward declaration */
typedef size_t hook_h;    /* Still a handle, for other reasons */


/* Create new engine instance, result returned in en */
int en_open(int mode, struct engine **en);

/* Start an engine */
int en_start(struct engine *en);

/* Add a new hook to the engine; hook handle returned in hh */
int en_add_hook(struct engine *en, int hooknum, hook_h *hh);

Of course, this makes the internal API implementations look a lot better, eliminating casts, and maintaining type safety to/from the consumer's perspective.

libengine.c

struct engine
{
    // ... implementation details ...
};

int en_open(int mode, struct engine **en)
{
    struct engine *_e;

    _e = malloc(sizeof(*_e));
    if (!_e)
        return -1;

    // ...initialization...

    *en = _e;
    return 0;
}

int en_start(struct engine *en)
{
    return en->start(en);
}

I prefer this for the following reasons:

However, the owner of the project balked at the pull request (paraphrased):

Personally I don't like the idea of exposing the struct engine. I still think the current way is cleaner & more friendly.

Initially I used another data type for hook handle, but then decided to switch to use enh, so all kind of handles share the same data type to keep it simple. If this is confusing, we can certainly use another data type.

Let's see what others think about this PR.

This library is currently in a private beta stage, so there isn't much consumer code to worry about (yet). Also, I've obfuscated the names a bit.


How is an opaque handle better than a named, opaque struct?

Note: I asked this question at Code Review, where it was closed.

Best Answer

The "simple is better" mantra has become too much dogmatic. Simple is not always better if it complicates other things. Assembly is simple - each command is much simpler than higher-level languages commands - and yet Assembly programs are more complex than higher-level languages that do the same thing. In your case, the uniform handle type enh makes the types simpler at the cost of making the functions complex. Since usually project's types tend to grow in sub-linear rate compared to it's functions, as the project gets bigger you usually prefer more complex types if it can make the functions simpler - so in this regard your approach seems to be the correct one.

The project's author is concerned that your approach is "exposing the struct engine". I would have explained to them that it's not exposing the struct itself - only the fact that there is a struct named engine. The user of the library already needs to be aware of that type - they need to know, for example, that the first argument of en_add_hook is of that type, and the first argument is of a different type. So it actually makes the API more complex, because instead of having the function's "signature" document these types it needs to be documented somewhere else, and because the compiler can no longer verify the types for the programmer.

One thing that should be noted - your new API makes user code a bit more complex, since instead of writing:

enh en;
en_open(ENGINE_MODE_1, &en);

They now need more complex syntax to declare their handle:

struct engine* en;
en_open(ENGINE_MODE_1, &en);

The solution, however, is quite simple:

struct _engine;
typedef struct _engine* engine

and now you can straightforwardly write:

engine en;
en_open(ENGINE_MODE_1, &en);