Optimization – When to Offload Work to a GPU Instead of CPU

cpu, gpu, optimization

Newer systems such as OpenCL are being made so that we can run more and more code on our graphics processors, which makes sense, because we should be able to utilise as much of the power in our systems as possible.

However, with all of these new systems, it seems as if GPUs are better than CPUs in every way. Because GPUs can do parallel calculation, multi-core GPUs seem like they'd be much better than multi-core CPUs; you'd be able to do many calculations at once and really improve speed. Are there still cases where serial processing is better, faster, and/or more efficient than parallel?

Best Answer

"However, with all of these new systems, it seems as if GPUs are better than CPUs in every way."

This is a fundamental misunderstanding. Present GPU cores are still limited compared with current top-of-the-line CPUs. I think NVIDIA's Fermi architecture is the most powerful GPU currently available, yet it has only 32-bit registers for integer arithmetic, and less capability for branch prediction and speculative execution than a current commodity Intel processor. Intel i7 chips provide three levels of caching; Fermi cores have only two, and each cache on the Fermi is smaller than the corresponding cache on the i7. Interprocess communication between the GPU cores is fairly limited, and your calculations have to be structured to accommodate that limitation: the cores are ganged into blocks, and communication between cores in a block is relatively fast, but communication between blocks is slow.
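
To make that block structure concrete, here is a minimal CUDA sketch of a sum reduction (the kernel name, the fixed block size of 256 threads, and the reduction workload itself are illustrative assumptions, not something the answer prescribes). Threads inside one block cooperate through fast on-chip shared memory and a cheap barrier, while combining results across blocks has to go through slow global memory.

```cuda
// Sketch only: assumes the kernel is launched with 256 threads per block.
__global__ void blockSum(const float *in, float *out, int n)
{
    __shared__ float partial[256];           // fast memory, visible only inside this block

    int tid = threadIdx.x;
    int i   = blockIdx.x * blockDim.x + tid;

    partial[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();                         // cheap barrier, block-local only

    // Tree reduction inside the block, entirely in shared memory.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride)
            partial[tid] += partial[tid + stride];
        __syncthreads();
    }

    // Combining results across blocks has to go through global memory: the slow path.
    if (tid == 0)
        atomicAdd(out, partial[0]);
}
```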

A significant limitation of current GPUs is that the cores all have to be running the same code. Unlike the cores in your CPU, you can't tell one GPU core to run your email client, and another core to run your web server. You give the GPU the function to invert a matrix, and all the cores run that function on different bits of data.
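
As a rough sketch of that "one function, many data" model, a CUDA kernel such as the SAXPY below (the kernel name and workload are illustrative assumptions) is launched across thousands of threads, every one of which executes the same body and differs only in which element its index selects.

```cuda
// Every GPU thread runs this same kernel; only the index differs per thread.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // each thread gets a different element
    if (i < n)
        y[i] = a * x[i] + y[i];                     // same code, different bit of data
}

// Launch: thousands of threads, all running saxpy and nothing else, e.g.
// saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, d_x, d_y);
```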

The processors on the GPU live in an isolated world. They can control the display, but they have no access to the disk, the network, or the keyboard.

Access to the GPU system has substantial overhead costs. The GPU has its own memory, so your calculations will be limited to the amount of memory on the GPU card. Transferring data between the GPU memory and main memory is relatively expensive. Pragmatically this means that there is no benefit in handing a handful of short calculations from the CPU to the GPU, because the setup and teardown costs will swamp the time required to do the calculation.
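
The host-side sketch below illustrates that fixed overhead (error handling is omitted, the kernel is a placeholder, and the function and buffer names are assumptions). The allocations and the two copies between main memory and GPU memory are paid regardless of how small the computation is, which is why a handful of short calculations is better left on the CPU.

```cuda
#include <cuda_runtime.h>

// Setup and teardown costs surrounding any GPU computation.
void runOnGpu(const float *host_in, float *host_out, int n)
{
    float *d_in = nullptr, *d_out = nullptr;
    size_t bytes = n * sizeof(float);

    cudaMalloc(&d_in,  bytes);                                  // setup: allocate GPU memory
    cudaMalloc(&d_out, bytes);                                  // setup: allocate GPU memory
    cudaMemcpy(d_in, host_in, bytes, cudaMemcpyHostToDevice);   // main memory -> GPU memory

    // someKernel<<<(n + 255) / 256, 256>>>(d_in, d_out, n);    // the actual work (placeholder)

    cudaMemcpy(host_out, d_out, bytes, cudaMemcpyDeviceToHost); // GPU memory -> main memory
    cudaFree(d_in);                                             // teardown
    cudaFree(d_out);                                            // teardown
}
```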

The bottom line is that GPUs are useful when you have many (as in hundreds or thousands of) copies of a long calculation that can be computed in parallel. Typical tasks where this is the case are scientific computing, video encoding, and image rendering. For an application like a text editor, the only function where a GPU might be useful is rendering the type on the screen.
