CentOS 7 w/Gnome hangs on boot after Nvidia driver installation


There is a lot of information available on these topics separately, but I haven't been able to find an answer to what I feel is a really common situation.

I have 2 Nvidia GTX 1080s in a server with CentOS 7 and Gnome desktop. The GPUs are going to be used exclusively for CUDA calculation, not video output.

See the screenshot of the kernel loading screen:

[Screenshot: load screen of GUI after Nvidia driver installation]

My xorg.conf looks like this:

[root@0cc47a8a1a10 ~]# cat /etc/X11/xorg.conf
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 367.44  (buildmeister@swio-display-x86-rhel47-01)  Wed Aug 17 22:54:35 PDT 2016

Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "Files"
    FontPath        "/usr/share/fonts/default/Type1"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/input/mice"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

[root@0cc47a8a1a10 ~]#

Here's the last part of /var/log/Xorg.5.log:

[    37.157] (==) ModulePath set to "/usr/lib64/xorg/modules"
[    37.157] (WW) Hotplugging is on, devices using drivers 'kbd', 'mouse' or 'vmmouse' will be disabled.
[    37.157] (WW) Disabling Keyboard0
[    37.157] (WW) Disabling Mouse0
[    37.157] (II) Loader magic: 0x7fd419fc1020
[    37.157] (II) Module ABI versions:
[    37.157]    X.Org ANSI C Emulation: 0.4
[    37.157]    X.Org Video Driver: 19.0
[    37.157]    X.Org XInput driver : 21.0
[    37.157]    X.Org Server Extension : 9.0
[    37.157] (II) xfree86: Adding drm device (/dev/dri/card1)
[    37.157] (II) xfree86: Adding drm device (/dev/dri/card2)
[    37.157] (II) xfree86: Adding drm device (/dev/dri/card0)
[    37.157] (II) xfree86: Adding drm device (/dev/dri/card3)
[    37.157] (II) xfree86: Adding drm device (/dev/dri/card4)
[    37.165] (--) PCI: (0:2:0:0) 10de:1b80:10de:119e rev 161, Mem @ 0xcf000000/16777216, 0x383fe0000000/268435456, 0x383ff0000000/33554432, I/O @ 0x00006000/128, BIOS @ 0x????????/524288
[    37.165] (--) PCI: (0:3:0:0) 10de:1b80:10de:119e rev 161, Mem @ 0xcd000000/16777216, 0x383fc0000000/268435456, 0x383fd0000000/33554432, I/O @ 0x00005000/128, BIOS @ 0x????????/524288
[    37.165] (--) PCI:*(0:6:0:0) 1a03:2000:15d9:0852 rev 48, Mem @ 0xcb000000/16777216, 0xcc000000/131072, I/O @ 0x00004000/128, BIOS @ 0x????????/131072
[    37.165] (--) PCI: (0:131:0:0) 10de:1b80:10de:119e rev 161, Mem @ 0xfa000000/16777216, 0x387fe0000000/268435456, 0x387ff0000000/33554432, I/O @ 0x0000d000/128, BIOS @ 0x????????/524288
[    37.165] (--) PCI: (0:132:0:0) 10de:1b80:10de:119e rev 161, Mem @ 0xf8000000/16777216, 0x387fc0000000/268435456, 0x387fd0000000/33554432, I/O @ 0x0000c000/128, BIOS @ 0x????????/524288
[    37.165] (II) LoadModule: "glx"
[    37.165] (II) Loading /usr/lib64/xorg/modules/extensions/libglx.so
[    37.171] (II) Module glx: vendor="NVIDIA Corporation"
[    37.171]    compiled for 4.0.2, module version = 1.0.0
[    37.171]    Module class: X.Org Server Extension
[    37.171] (II) NVIDIA GLX Module  367.44  Wed Aug 17 21:50:26 PDT 2016
[    37.171] (II) LoadModule: "nvidia"
[    37.171] (II) Loading /usr/lib64/xorg/modules/drivers/nvidia_drv.so
[    37.171] (II) Module nvidia: vendor="NVIDIA Corporation"
[    37.171]    compiled for 4.0.2, module version = 1.0.0
[    37.171]    Module class: X.Org Video Driver
[    37.171] (II) NVIDIA dlloader X Driver  367.44  Wed Aug 17 21:28:13 PDT 2016
[    37.171] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[    37.171] (++) using VT number 1

[    37.171] (EE) No devices detected.
[    37.171] (EE)
Fatal server error:
[    37.171] (EE) no screens found(EE)
[    37.171] (EE)
Please consult the The X.Org Foundation support
         at http://wiki.x.org
 for help.
[    37.171] (EE) Please also check the log file at "/var/log/Xorg.5.log" for additional information.
[    37.171] (EE)
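
For a long log like this one, it can help to filter down to just the error lines. A minimal sketch, assuming a POSIX shell and the log path quoted above; the function name `xorg_errors` is made up for illustration:

```shell
#!/bin/sh
# Print only the error (EE) lines from an Xorg log file.
# Defaults to the log quoted above; pass another path as the first argument.
xorg_errors() {
    log="${1:-/var/log/Xorg.5.log}"
    # -F matches "(EE)" as a literal string rather than a regex group.
    grep -F '(EE)' "$log"
}
```

Calling `xorg_errors` with no argument reads /var/log/Xorg.5.log, and `xorg_errors /path/to/other.log` reads a different file. The unprefixed "Fatal server error:" line does not carry the (EE) marker, so it is not included in the output.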

Best Answer

It turns out that the Nvidia driver installer clobbers some files involving libglx.so. I don't know exactly which files, and copying the original libglx.so back over the one the Nvidia installer puts there didn't fix things either.

Running the installer with the "--no-opengl-files" flag and selecting "No" when prompted to overwrite the X configuration resolved this issue.

In more detail, the steps were:

  • Install CentOS 7 with Gnome desktop
  • After boot:
    • yum -y update
    • yum -y install kernel-devel epel-release
    • yum -y install dkms gcc gcc-c++
    • Reboot (to get to new kernel)
  • After boot:
    • sh latest_nvidia_driver.run --no-opengl-files
    • Select "no" when prompted for xconfig overwrite
    • systemctl set-default graphical.target (if your default run level is not already graphical)
    • Reboot
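
The steps above can be sketched as a two-phase shell script, split at the first reboot. This is only an illustration, not a tested installer: the `run`, `phase1`, and `phase2` names and the `DRY_RUN` switch are made up here, and `latest_nvidia_driver.run` is the placeholder file name from the steps. The installer's xconfig prompt still has to be answered "no" by hand.

```shell
#!/bin/sh
# Two-phase sketch of the steps above. With DRY_RUN=1 (the default here)
# each command is only printed; set DRY_RUN=0 and run as root to execute.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Phase 1: update the system and install build prerequisites, then reboot
# into the new kernel.
phase1() {
    run yum -y update
    run yum -y install kernel-devel epel-release
    # gcc-c++ is the CentOS package that provides the C++ compiler.
    run yum -y install dkms gcc gcc-c++
    run reboot
}

# Phase 2: install the driver without its OpenGL files, then reboot.
# Answer "no" when the installer asks whether to overwrite the X config.
phase2() {
    run sh latest_nvidia_driver.run --no-opengl-files
    run systemctl set-default graphical.target
    run reboot
}

# Run a phase only when one is named on the command line.
case "${1:-}" in
    phase1) phase1 ;;
    phase2) phase2 ;;
esac
```

Saved under a hypothetical name such as `install_steps.sh`, running `sh install_steps.sh phase1` prints the first batch of commands; after the reboot, `sh install_steps.sh phase2` prints the second. Keeping the dry run as the default makes it easy to review the commands before letting them touch the system.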