C++ – C/C++ How Does Dynamic Linking Work On Different Platforms

c compilation dynamic-linking loadlibrary

How does dynamic linking work generally?

On Windows (LoadLibrary), you need a .dll to call at runtime, but at link time, you need to provide a corresponding .lib file or the program won't link… What does the .lib file contain? A description of the .dll methods? Isn't that what the headers contain?

Relatedly, on *nix, you don't need a lib file… How does the compiler know that the methods described in the header will be available at runtime?

As a newbie, when you think about either one of the two schemes, then the other, neither of them makes sense…

Best Answer

To answer your questions one by one:

Dynamic linking defers part of the linking process to runtime. It can be used in two ways: implicitly and explicitly. Implicitly, the static linker will insert information into the executable which will cause the library to load and resolve the necessary symbols. Explicitly, you must call LoadLibrary or dlopen manually, and then GetProcAddress/dlsym for each symbol you need to use. Implicit loading is used for things like the system library, where the implementation will depend on the version of the system, but the interface is guaranteed. Explicit loading is used for things like plug-ins, where the library to be loaded will be determined at runtime.
The .lib file is only necessary for implicit loading. It contains the information that the library actually provides this symbol, so the linker won't complain that the symbol is undefined, and it tells the linker in what library the symbols are located, so it can insert the necessary information to cause this library to automatically be loaded. All the header files tell the compiler is that the symbols will exist, somewhere; the linker needs the .lib to know where.
Under Unix, all of the information is extracted from the .so. Why Windows requires two separate files, rather than putting all of the information in one file, I don't know; it's actually duplicating most of the information, since the information needed in the .lib is also needed in the .dll. (Perhaps licensing issues. You can distribute your program with the .dll, but no one can link against the libraries unless they have a .lib.)

The main thing to retain is that if you want implicit loading, you have to provide the linker with the appropriate information, either with a .lib or a .so file, so that it can insert that information into the executable. And that if you want explicit loading, you can't refer to any of the symbols in the library directly; you have to call GetProcAddress/dlsym to get their addresses yourself (and do some funny casting to use them).
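
A minimal sketch of explicit loading on a POSIX system, assuming Linux with glibc, where the math library is installed as libm.so.6 and exports cos. The helper name call_cosine_dynamically is hypothetical and only there for illustration; on Windows the analogous calls would be LoadLibrary, GetProcAddress and FreeLibrary.

```c
#include <dlfcn.h>   /* dlopen, dlsym, dlclose, dlerror */
#include <stdio.h>

/* Type of the function we expect to find in the library. */
typedef double (*unary_math_fn)(double);

/* Hypothetical helper: loads `library`, looks up "cos" and calls it on x.
   Returns 0 on success and -1 if the library or the symbol is missing. */
int call_cosine_dynamically(const char *library, double x, double *result) {
    void *handle = dlopen(library, RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    /* dlsym returns a void *, so it has to be cast to the real function
       type before it can be called: the "funny casting" mentioned above. */
    unary_math_fn cosine = (unary_math_fn)dlsym(handle, "cos");
    if (cosine == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    *result = cosine(x);
    dlclose(handle);
    return 0;
}
```

Note that nothing here refers to cos directly, so the static linker never needs to see the library. On older glibc versions the program may have to be linked with -ldl.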

Function pointers in C

Let's start with a basic function which we will be pointing to:

int addInt(int n, int m) {
    return n+m;
}

First thing, let's define a pointer to a function which receives 2 ints and returns an int:

int (*functionPtr)(int,int);

Now we can safely point to our function:

functionPtr = &addInt;

Now that we have a pointer to the function, let's use it:

int sum = (*functionPtr)(2, 3); // sum == 5

Passing the pointer to another function is basically the same:

int add2to3(int (*functionPtr)(int, int)) {
    return (*functionPtr)(2, 3);
}

We can use function pointers in return values as well (try to keep up, it gets messy):

// this is a function called functionFactory which receives parameter n
// and returns a pointer to another function which receives two ints
// and it returns another int
int (*functionFactory(int n))(int, int) {
    printf("Got parameter %d", n);
    int (*functionPtr)(int,int) = &addInt;
    return functionPtr;
}

But it's much nicer to use a typedef:

typedef int (*myFuncDef)(int, int);
// note that the typedef name is indeed myFuncDef

myFuncDef functionFactory(int n) {
    printf("Got parameter %d", n);
    myFuncDef functionPtr = &addInt;
    return functionPtr;
}
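
To see the pieces above working together, here is a small self-contained sketch. It repeats addInt, add2to3 and functionFactory in their typedef form; the demo function is a hypothetical helper added only to exercise them.

```c
#include <stdio.h>

int addInt(int n, int m) {
    return n + m;
}

typedef int (*myFuncDef)(int, int);

/* Receives a function pointer and calls it with fixed arguments. */
int add2to3(myFuncDef functionPtr) {
    return functionPtr(2, 3);   /* equivalent to (*functionPtr)(2, 3) */
}

/* Returns a pointer to addInt, as in the snippets above. */
myFuncDef functionFactory(int n) {
    printf("Got parameter %d\n", n);
    return &addInt;
}

/* Hypothetical helper that exercises all three pieces. */
int demo(void) {
    myFuncDef f = functionFactory(7);  /* f now points to addInt */
    int direct = f(10, 20);            /* 30 */
    int passed = add2to3(f);           /* 5 */
    return direct + passed;            /* 35 */
}
```

Calling through the pointer with f(10, 20) and with (*f)(10, 20) is the same thing in C; the explicit dereference is a matter of style.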

C++ – Static linking vs dynamic linking

Dynamic linking can reduce total resource consumption if more than one process shares the same library (where "the same" includes the version, of course). I believe this is the argument that drives its presence in most environments. Here "resources" includes disk space, RAM, and cache space. Of course, if your dynamic linker is insufficiently flexible there is a risk of DLL hell.
Dynamic linking means that bug fixes and upgrades to libraries propagate to improve your product without requiring you to ship anything.
Plugins always call for dynamic linking.
Static linking means that you can know the code will run in very limited environments (early in the boot process, or in rescue mode).
Static linking can make binaries easier to distribute to diverse user environments (at the cost of sending a larger and more resource hungry program).
Static linking may allow slightly faster startup times, but this depends to some degree on both the size and complexity of your program and on the details of the OS's loading strategy.

Some edits to include the very relevant suggestions in the comments and in other answers. I'd like to note that the way you break on this depends a lot on what environment you plan to run in. Minimal embedded systems may not have enough resources to support dynamic linking. Slightly larger small systems may well support dynamic linking, because their memory is small enough to make the RAM savings from dynamic linking very attractive. Full blown consumer PCs have, as Mark notes, enormous resources, and you can probably let the convenience issues drive your thinking on this matter.

To address the performance and efficiency issues: it depends.

Classically, dynamic libraries require some kind of glue layer, which often means double dispatch or an extra layer of indirection in function addressing and can cost a little speed (but is function-calling time actually a big part of your running time?).
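
The extra indirection can be pictured with a toy simulation in plain C. This is only a sketch of the idea, not how any real loader is implemented: actual systems use structures such as the PLT/GOT on ELF platforms or the import address table on Windows, and the details differ. All names below (add_slot, add_stub, call_add and so on) are made up for the illustration.

```c
static int resolver_calls = 0;   /* how many times the slow path ran */

static int real_add(int a, int b) {
    return a + b;
}

/* Forward declaration of the stub that the slot initially points to. */
static int add_stub(int a, int b);

/* The "table slot": every call goes through this pointer. */
static int (*add_slot)(int, int) = add_stub;

/* The first call lands here: "resolve" the symbol, patch the slot,
   then forward the call to the real function. */
static int add_stub(int a, int b) {
    resolver_calls++;
    add_slot = real_add;   /* later calls skip the stub entirely */
    return real_add(a, b);
}

/* What a call site conceptually does: one extra pointer load per call. */
int call_add(int a, int b) {
    return add_slot(a, b);
}

int get_resolver_calls(void) {
    return resolver_calls;
}
```

Only the first call pays for resolution; every later call pays just the indirect jump through add_slot, which is the "little speed" cost described above.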

However, if you are running multiple processes which all call the same library a lot, you can end up saving cache lines (and thus winning on running performance) when using dynamic linking relative to using static linking. (Unless modern OSes are smart enough to notice identical segments in statically linked binaries. Seems hard, anyone know?)

Another issue: loading time. You pay loading costs at some point. When you pay this cost depends on how the OS works as well as what linking you use. Maybe you'd rather put off paying it until you know you need it.

Note that static-vs-dynamic linking is traditionally not an optimization issue, because they both involve separate compilation down to object files. However, this is not required: a compiler can, in principle, "compile" "static libraries" to a digested AST form initially, and "link" them by adding those ASTs to the ones generated for the main code, thus empowering global optimization. None of the systems I use do this, so I can't comment on how well it works.

The way to answer performance questions is always by testing (and use a test environment as much like the deployment environment as possible).
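
As a starting point for such a test, here is a rough timing sketch that compares direct calls with calls made through a function pointer. It is an assumption-laden micro-benchmark, not a verdict: the function names are hypothetical, clock() measures CPU time with coarse resolution, and an optimizing compiler may inline the direct loop completely, which is itself part of what you would be measuring.

```c
#include <time.h>

static int add_one(int x) {
    return x + 1;
}

/* Calls add_one directly `iterations` times.
   Stores the final value in *out and returns elapsed CPU seconds. */
double time_direct_calls(long iterations, int *out) {
    int acc = 0;
    clock_t start = clock();
    for (long i = 0; i < iterations; i++) {
        acc = add_one(acc);
    }
    clock_t end = clock();
    *out = acc;
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Same loop, but through a volatile function pointer, so the compiler
   must reload the pointer and make a real indirect call each time. */
double time_indirect_calls(long iterations, int *out) {
    int (*volatile fn)(int) = add_one;
    int acc = 0;
    clock_t start = clock();
    for (long i = 0; i < iterations; i++) {
        acc = fn(acc);
    }
    clock_t end = clock();
    *out = acc;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Run both with the same iteration count and the same compiler flags you would ship with, on hardware as close to the deployment environment as possible, and compare the returned times over several runs.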
