The relationship between OpenGL, GLX, DRI, and Mesa3D

game development, linux, opengl

I am starting out doing some low-level 3D programming in Linux. I have a lot of experience using the higher level graphics API OpenInventor.

I know it is not strictly necessary to be aware of how all these things fit together but I'm just curious. I know OpenGL is just a standard for graphics applications. Mesa3D seems to be an open source implementation of this standard.

So where do GLX and DRI fit? Digging around on Wikipedia and all these websites, I've yet to find an explanation of exactly how it all goes together. Where does hardware acceleration happen? What do proprietary drivers have to do with this?

Best Answer

Apart from OpenGL, I have never used these libraries, so I'm going to make an educated guess from the Wikipedia pages, as you did.

You seem to be right about Mesa. Here is the additional information we have:

"The X Window System is a computer software system and network protocol that provides a basis for GUIs for networked computers. It creates a hardware abstraction layer."

"GLX enables programs wishing to use OpenGL to do so within a window provided by the X Window System.
GLX consists of three parts:
- An API that provides OpenGL functions.
- An extension of the X protocol, which allows the client to send 3D rendering commands
- An extension of the X server that receives the rendering commands from the client and passes them on to the installed OpenGL library
If client and server are running on the same computer and an accelerated 3D graphics card is available, the [latter] two components can be bypassed by DRI. The client program is then allowed to directly access the graphics hardware."

"Direct Rendering Infrastructure (DRI) is an interface used in the X Window System to allow user applications to access the video hardware without requiring data to be passed through the X server."

"Open Inventor is a C++ 3D graphics API designed to provide a higher layer of programming for OpenGL"

To make things simpler, let's imagine a simplified flow of data (and commands) through the inputs and outputs of each of these APIs. At the very beginning we have your application program (compiled code), which you run on your computer. At the end we have the images that are displayed on your screen.

There are several cases, which I will narrow down using the answers to these questions:
- Does your computer have a graphics card (GPU), or only a CPU, to process graphics functions?
- Is your application embedded in a window of the X Window System?
- If you use the X Window System, is the "X server" running on your computer or on another computer on the network?
I'll assume that you have the drivers for your GPU if you have one, and that you have Mesa for software rendering.

First scenario: you run a graphics application written with OpenInventor, without using the X Window System, and you don't have a graphics card. The program flow would look something like this:

Your application
  ↓ (uses functions of)
OpenInventor
  ↓ (calls functions declared by)
OpenGL
  ↓ (redirects function calls to implementation defined by)
Mesa
  ↓ (implements OpenGL functions to be run on the CPU)
[Probably] Operating System rendering API
  ↓
3D Images on your screen

What happens here is called "software rendering": the graphics commands are not handled by any graphics hardware, but by your CPU, the general-purpose processor that runs ordinary software.
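To make "software rendering" concrete, here is a minimal sketch of a triangle being filled entirely by CPU code. Everything in it (the function names, the character framebuffer, the edge test) is my own illustration and an assumption; it is not how Mesa is actually implemented, only the same idea at toy scale.

```python
# Toy software rasterizer: every pixel is decided by ordinary CPU code.
# All names are hypothetical and chosen for this sketch only.

def edge(ax, ay, bx, by, px, py):
    """Signed area test: tells on which side of edge A->B the point P lies."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def rasterize_triangle(width, height, v0, v1, v2):
    """Return a character framebuffer (list of strings) with the triangle filled."""
    framebuffer = [["." for _ in range(width)] for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Sample at the center of the pixel.
            px, py = x + 0.5, y + 0.5
            w0 = edge(*v1, *v2, px, py)
            w1 = edge(*v2, *v0, px, py)
            w2 = edge(*v0, *v1, px, py)
            # The point is inside if all three edge tests agree in sign,
            # which accepts both clockwise and counter-clockwise triangles.
            all_non_negative = w0 >= 0 and w1 >= 0 and w2 >= 0
            all_non_positive = w0 <= 0 and w1 <= 0 and w2 <= 0
            if all_non_negative or all_non_positive:
                framebuffer[y][x] = "#"
    return ["".join(row) for row in framebuffer]


# An 8x4 "screen" with a right triangle in its top-left corner.
image = rasterize_triangle(8, 4, (0, 0), (8, 0), (0, 4))
print("\n".join(image))
```

A GPU does the same kind of per-pixel work, but in dedicated hardware and for many pixels at once, which is where the speed difference comes from.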

Second scenario: now imagine that, under the same conditions as above, you have a graphics card. The flow would look more like this:

Your application
  ↓ (uses functions of)
OpenInventor
  ↓ (calls functions declared by)
OpenGL
  ↓ (redirects function calls to implementation defined by)
Proprietary Drivers
  ↓ (converts OpenGL commands to GPU commands)
Graphics Card
  ↓
3D Images on your screen

What happens now is called "hardware acceleration", which is usually faster than the first scenario.
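The point of the first two scenarios is that the application calls the same OpenGL functions either way, and only the implementation behind them changes. Here is a small sketch of that idea, assuming a made-up `draw_triangles` call and two made-up implementation classes; none of these names come from OpenGL, Mesa, or any driver.

```python
from abc import ABC, abstractmethod


class OpenGLImplementation(ABC):
    """Stands in for the OpenGL standard: it declares calls and nothing more."""

    @abstractmethod
    def draw_triangles(self, count: int) -> str:
        ...


class MesaSoftware(OpenGLImplementation):
    """Hypothetical stand-in for scenario one: the work is done on the CPU."""

    def draw_triangles(self, count: int) -> str:
        return f"CPU rasterized {count} triangles"


class VendorDriver(OpenGLImplementation):
    """Hypothetical stand-in for scenario two: the work is sent to the GPU."""

    def draw_triangles(self, count: int) -> str:
        return f"GPU received commands for {count} triangles"


def pick_implementation(has_gpu: bool) -> OpenGLImplementation:
    # In this toy model the only deciding factor is whether a GPU is present.
    return VendorDriver() if has_gpu else MesaSoftware()


def render_scene(gl: OpenGLImplementation) -> str:
    # The application code is identical in both scenarios.
    return gl.draw_triangles(12)


print(render_scene(pick_implementation(has_gpu=False)))
print(render_scene(pick_implementation(has_gpu=True)))
```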

Third scenario: now let's introduce the X Window System flow, or at least how I understand it from the few Wikipedia lines I read.
Let's forget about the graphics hardware and API for a while. The flow should look like this:

Your application (X Window System sees it as an "X Client")
  ↓ (sends requests defined by the X Window System Core Protocol)
X Server
  ↓ (converts your requests to graphics commands)
[Probably] Operating System rendering API
  ↓
Windows or 2D images on your screen

Note that when using the X Window System, your screen and the computer from which you run your application may not be "directly" connected, but could be connected through a network.

Fourth scenario: suppose you want to add fancy 3D rendering to the X client application from the previous example. It seems to me that the X Window System was not originally able to do this, or at least that it would take very convoluted code to achieve the equivalent of a single OpenGL API function.
Luckily, you can use GLX to add support for OpenGL commands to the system. You now have:

Your application
  ↓ (sends graphic requests defined by the "GLX extension to the X Protocol")
X Server with the GLX extension
  ↓ (converts your requests to OpenGL commands)
OpenGL
  ↓ (redirects function calls to implementation defined by)
 ...

Now you can reconnect that last arrow to the one after "OpenGL" in the first scenario: you get 3D images on your screen!
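As a rough illustration of this fourth scenario, the sketch below shows a client packing rendering commands into bytes and a server unpacking them and handing them to its OpenGL library. The opcodes and the byte layout are invented for this example and are an assumption on my part; the real GLX wire format is different.

```python
import struct

# Hypothetical opcodes, invented for this sketch; real GLX opcodes differ.
OP_CLEAR = 1
OP_DRAW_TRIANGLE = 2


def encode_request(opcode, *floats):
    """Client side: pack one command as opcode, argument count, arguments."""
    return struct.pack(f"<BB{len(floats)}f", opcode, len(floats), *floats)


def decode_stream(data):
    """Server side: unpack the byte stream back into (opcode, args) tuples."""
    commands = []
    offset = 0
    while offset < len(data):
        opcode, argc = struct.unpack_from("<BB", data, offset)
        offset += 2
        args = struct.unpack_from(f"<{argc}f", data, offset)
        offset += 4 * argc  # each float takes 4 bytes on the wire
        commands.append((opcode, args))
    return commands


def x_server_handle(data, opengl_library):
    """Server side: pass every decoded command on to the OpenGL library."""
    for opcode, args in decode_stream(data):
        opengl_library(opcode, args)


# The client builds the byte stream; it could travel over a network socket.
wire = (
    encode_request(OP_CLEAR, 0.0, 0.0, 0.0, 1.0)
    + encode_request(OP_DRAW_TRIANGLE, -1.0, -1.0, 1.0, -1.0, 0.0, 1.0)
)

# The "OpenGL library" here is just a function that records what it is given.
received = []
x_server_handle(wire, lambda opcode, args: received.append((opcode, args)))
print(received)
```

Because the commands travel as bytes, the client and the server do not need to be on the same machine, which matches the network case mentioned in the third scenario.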

Finally, here is what I think I understand about DRI:
It seems to give Mesa access to the GPU, which would change the flow of our first scenario into:

...
  ↓
Mesa
  ↓ (forwards OpenGL commands)
DRI
  ↓ (converts OpenGL commands to GPU commands)
Graphics Card
  ↓
3D Images on your screen

It also seems to short-circuit the flow when using GLX, provided that the server and client are on the same computer and that you have a GPU. In that case, the chart of our fourth scenario would simply become:

Your application
  ↓ (sends graphic requests defined by the "GLX extension to the X Protocol")
DRI
  ↓ ("catches" OpenGL commands and converts them to GPU commands)
Graphics Card
  ↓
3D Images on your screen

That's it!
Keep in mind that I'm not an expert in Unix environments, so my best advice is to study the documentation of each of these APIs to learn precisely what they can do.
Combining the previous charts into a single one might make things easier to understand. I leave this as an exercise for you!
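If you want something to compare your own chart against, here is one possible way to merge the scenarios, written as a small function driven by the three questions from the beginning. It only encodes my simplified model from this answer (OpenInventor is left out for brevity), so treat the stage names and the branching as assumptions rather than as a description of the real software stack.

```python
def backend(has_gpu):
    """Stages after "OpenGL", following the first two scenarios."""
    if has_gpu:
        return ["OpenGL implementation (driver, or Mesa via DRI)", "Graphics card"]
    return ["Mesa (software rendering on the CPU)", "Operating system rendering API"]


def build_pipeline(uses_x, server_is_local, has_gpu):
    """Return the stages between the application and the screen.

    has_gpu refers to the machine that actually draws the image, which is
    the machine running the X server when the X Window System is used.
    """
    stages = ["Application"]
    if uses_x and server_is_local and has_gpu:
        # Direct rendering: the X server is bypassed.
        stages += ["DRI", "Graphics card"]
    elif uses_x:
        # Indirect rendering: commands go through the X server.
        stages += ["X server with GLX extension", "OpenGL"] + backend(has_gpu)
    else:
        stages += ["OpenGL"] + backend(has_gpu)
    stages.append("Screen")
    return stages


for answers in [(False, False, False), (False, False, True),
                (True, False, True), (True, True, True)]:
    print(answers, "->", " -> ".join(build_pipeline(*answers)))
```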