Nick, I think all the other answers are actually answering your question, which is how you link libraries, but the way you phrase your question suggests you have a misunderstanding of the difference between headers files and libraries. They are not the same. You need both, and they are not doing the same thing.
Building an executable has two main phases, compilation (which turns your source into an intermediate form, containing executable binary instructions, but is not a runnable program), and linking (which combines these intermediate files into a single running executable or library).
When you do gcc -c program.c
, you are compiling, and you generate program.o
. This step is where headers matter. You need to #include <stdlib.h>
in program.c
to (for example) use malloc
and free
. (Similarly you need #include <dlfcn.h>
for dlopen
and dlsym
.) If you don't do that the compiler will complain that it doesn't know what those names are, and halt with an error. But if you do #include
the header the compiler does not insert the code for the function you call into program.o
. It merely inserts a reference to them. The reason is to avoid duplication of code: The code is only going to need to be accessed once by every part of your program, so if you needed further files (module1.c
, module2.c
and so on), even if they all used malloc
you would merely end up with many references to a single copy of malloc
. That single copy is present in the standard library in either it's shared or static form (libc.so
or libc.a
) but these are not referenced in your source, and the compiler is not aware of them.
The linker is. In the linking phase you do gcc -o program program.o
. The linker will then search all libraries you pass it on the command line and find the single definition of all functions you've called which are not defined in your own code. That is what the -l
does (as the others have explained): tell the linker the list of libraries you need to use. Their names often have little to do with the headers you used in the previous step. For example to get use of dlsym
you need libdl.so
or libdl.a
, so your command-line would be gcc -o program program.o -ldl
. To use malloc
or most of the functions in the std*.h
headers you need libc
, but because that library is used by every C program it is automatically linked (as if you had done -lc
).
Sorry if I'm going into a lot of detail but if you don't know the difference you will want to. It's very hard to make sense of how C compilation works if you don't.
One last thing: dlopen
and dlsym
are not the normal method of linking. They are used for special cases where you want to dynamically determine what behavior you want based on information that is, for whatever reason, only available at runtime. If you know what functions you want to call at compile time (true in 99% of the cases) you do not need to use the dl*
functions.
It's been a while since I used ATL but, IIRC, what ends up being instantiated is not CMyClass
, but CComObject<CMyClass>
.
CComObject
implements IUnknown
and inherits from its template parameter.
Edit: The "Fundamentals of ATL COM Objects" page on MSDN nicely illustrates what's going on.
Best Answer
I'm looking to implement a custom implementation of COM in C++ on a UNIX type platform to allow me to dynamically load and link object oriented code. I'm thinking this would be based on a similar set of functionality that POSIX provides to load and call dll's ie dlopen, dlsym and dlclose.
At its simplest level, COM is implemented with interfaces. In c++, if you are comfortable with the idea of pure virtual, or abstract base classes, then you already know how to define an interface in c++
The COM runtime provides a lot of extra services that apply to the windows environment but arn't really needed when implementing "mini" COM in a single application as a means to dynamically link to a more OO interface than traditionally allowed by dlopen, dlsym, etc.
COM objects are implemented in .dll, .so or .dylib files depending on your platform. These files need to export at least one function that is standardized: DllGetClassObject
In your own environment you can prototype it however you want but to interop with the COM runtime on windows obviously the name and parameters need to conform to the com standard.
The basic idea is, this is passed a pointer to a GUID - 16 bytes that uniquely are assigned to a particular object, and it creates (based on the GUID) and returns the IClassFactory* of a factory object.
The factory object is then used, by the COM runtime, to create instances of the object when the IClassFactory::CreateInstance method is called.
So, so far you have
I understand that the general idea of COM is that you link to a few functions ie QueryInterface, AddRef and Release in a common dll (Kernel32.dll) which then allows you to access interfaces which are just a table of function pointers encapsulated with a pointer to the object for which the function pointers should be called with. These functions are exposed through IUnknown which you must inherit off of.
Actually, COM is implemented by OLE32.dll which exposes a "c" api called CoCreateInstance. The app passed CoCreateInstance a GUID, which it looks up in the windows registry - which has a DB of GUID -> "path to dll" mappings. OLE/COM then loads (dlopen) the dll, calls its DllGetClassObject (dlsym) method, passing in the GUID again, presuming that succeeds, OLE/COM then calls the CreateInstance and returns the resulting interface to app.
So how does this all work? Is there a better way to dynamically link and load to object oriented code? How does inheritance from a dll work - does every call to the base class have to be to an exposed member function i.e private/protected/public is simply ignored?
implicit inheritance of c++ code from a dll/so/dylib works by exporting every method in the class as a "decorated" symbol. The method name is decorated with the class, and type of every parameter. This is the same way the symbols are exported from static libraries (.a or .lib files iirc). Static or dynamic libraries, "private, protected etc." are always enforced by the compiler, parsing the header files, never the linker.
I'm quite well versed in C++ and template meta-programming and already have a fully reflective C++ system i.e member properties, member functions and global/static functions that uses boost.
c++ classes can typically only be exported from dlls with static linkage - dlls that are loaded at load, not via dlopen at runtime. COM allows c++ interfaces to be dynamically loaded by ensuring that all datatypes used in COM are either pod types, or are pure virtual interfaces. If you break this rule, by defining an interface that tries to pass a boost or any other type of object you will quickly get into a situation where the compiler/linker will need more than just the header file to figure out whats going on and your carefully prepared "com" dll will have to be statically or implicitly linked in order to function.
The other rule of COM is, never pass ownership of an object accross a dynamic library boundary. i.e. never return an interface or data from a dll, and require the app to delete it. Interfaces all need to implement IUnknown, or at least a Release() method, that allows the object to perform a delete this. Any returned data types likewise must have a well known de-allocator - if you have an interface with a method called "CreateBlob", there should probably be a buddy method called "DeleteBlob".