GeForce GTX 1060 card in Dell R710 or R730xd for machine learning

dell-poweredge graphics-processing-unit

We are looking into speeding up some machine learning code written with Theano and Keras, in particular by adding a GPU card. Does anyone have direct experience with this or a very similar combination? Specifically, we would like to know:

  • Is it physically possible to install a card such as a GTX 1060 in a Dell R710 or R730xd?
  • Is anything special required to get CentOS Linux to recognize the card, other than installing the necessary Nvidia drivers?
  • Are there any issues with respect to power, cooling, etc., we should worry about?

A similar question has been asked, but for a different card and operating system. Discussions elsewhere suggest it is possible with similar hardware, but a bit tricky. Before our organization buys the hardware, it would be helpful to know whether there are serious issues.

Best Answer

  1. You'll need the Nvidia proprietary drivers to use CUDA/OpenCL.

The card needs to be configured under X, because the Nvidia drivers are X drivers. It can still be set up to run "headless", and you can have multiple graphics cards installed.

Some details on running GPUs in headless servers from: https://sites.google.com/site/akohlmey/random-hacks/nvidia-gpu-coolness

Faking a "Head" for a Headless X Server

The biggest remaining challenge is now to make the X server launch properly without having a display attached. Nowadays, display settings are negotiated between the X server and the display via EDID, and this is how we can simulate a display. The X server allows to override EDID settings and to define which display to configure through settings in the /etc/X11/xorg.conf file. All that is missing is a valid EDID file and this can be obtained from nvidia-settings through the "Acquire EDID" button, when examining the properties of a currently attached display (doesn't matter which one). In the xorg.conf file, something along the lines of the following has to be set.

Section "Screen"
    Identifier     "Screen0"
    Option         "UseDisplayDevice" "DFP-0"
    Option         "ConnectedMonitor" "DFP-0"
    Option         "CustomEDID" "DFP-0:/etc/X11/dfp-edid.bin"
    Option         "Coolbits" "5"
    ....
EndSection
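Since the X server will rely on the file named in the CustomEDID option, it may be worth sanity-checking that file before restarting X. The following is only a rough sketch, assuming the EDID was saved in binary form: it checks the fixed 8-byte EDID header and the per-block checksum (each 128-byte block sums to 0 modulo 256). The function name check_edid is made up for this example and is not part of any Nvidia tool.

```python
# Hypothetical helper, not part of nvidia-settings or the X server.
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])
BLOCK_SIZE = 128  # an EDID blob is made of 128-byte blocks


def check_edid(data):
    """Return a list of problems found in a binary EDID blob (empty if none)."""
    if len(data) < BLOCK_SIZE:
        return ["shorter than one 128-byte EDID block"]

    problems = []
    if len(data) % BLOCK_SIZE != 0:
        problems.append("length is not a multiple of 128 bytes")
    if data[:8] != EDID_HEADER:
        problems.append("missing the fixed 8-byte EDID header")

    # Every complete block must sum to 0 modulo 256 (last byte is the checksum).
    for i in range(len(data) // BLOCK_SIZE):
        block = data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
        if sum(block) % 256 != 0:
            problems.append("checksum mismatch in block %d" % i)
    return problems
```

To use it, read the file in binary mode, for example open("/etc/X11/dfp-edid.bin", "rb").read(), and pass the bytes to check_edid. An empty list only means the file is structurally plausible, not that the X server will accept it.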
  2. I found the drivers prepackaged in ELRepo:

https://elrepo.org/tiki/tiki-index.php

They can also be downloaded from Nvidia's site, but that means no auto updating.
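Whichever way the driver is installed, one way to confirm that it actually sees the card is to run nvidia-smi -L, which lists the GPUs the driver has detected. Below is a small sketch that wraps that call from Python. The line format in the regular expression ("GPU 0: name (UUID: ...)") is an assumption about what the driver prints and may differ between driver versions; the function names are made up for this example.

```python
import re
import shutil
import subprocess

# Assumed line format of `nvidia-smi -L`; may vary between driver versions.
GPU_LINE = re.compile(r"^GPU (\d+): (.+?) \(UUID: (.+?)\)$")


def parse_gpu_list(text):
    """Turn `nvidia-smi -L` output into a list of (index, name, uuid) tuples."""
    gpus = []
    for line in text.splitlines():
        match = GPU_LINE.match(line.strip())
        if match:
            gpus.append((int(match.group(1)), match.group(2), match.group(3)))
    return gpus


def list_gpus():
    """Run `nvidia-smi -L`; return [] if the tool is missing or reports an error."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver utilities are not installed or not on PATH
    result = subprocess.run(
        ["nvidia-smi", "-L"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_list(result.stdout)
```

If list_gpus() returns an empty list on the server, either the driver utilities are not installed or the driver does not see the card, which would be the point to look at the BIOS and slot configuration.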

I can't say how the server will respond to having an additional GPU in it, but you may need to adjust settings in the BIOS. According to the site mentioned above about headless configuration, you may need to boot the server with the card configured as the primary graphics adapter, or at least plug in a monitor temporarily so you can set it up with the Nvidia utilities (to generate dfp-edid.bin).
