NVIDIA – Fix NVML Initialization Error on VMware ESXi 6.7

nvidia, vmware-esx, vmware-esxi, vmware-vsphere

I am unable to set up NVIDIA GRID with a Tesla P100 on a vSphere host running VMware ESXi 6.7 on a Dell EMC PowerEdge R740.

When I run the nvidia-smi command, I get the following error:

Failed to initialize NVML: Unknown Error

The installed NVIDIA driver is as follows:

# esxcli software vib list | grep -i nvidia
NVIDIA-VMware_ESXi_6.7_Host_Driver 390.113-1OEM.670.0.0.8169922 NVIDIA VMwareAccepted 2019-03-06

The module also shows as loaded on the host:

# vmkload_mod -l | grep nvidia
nvidia 0 13828

We have also made the following changes in the BIOS:

Memory Mapped I/O above 4 GB - Enabled
Memory Mapped I/O above Base - 512 GB

Host OS: VMware ESXi 6.7

NVIDIA graphics hardware: Tesla P100

Kindly help me solve this issue.

Best Answer

I solved this problem myself, using a solution I found in an online resource. As that resource explains, I had to disable DirectPath I/O on the host.

The fix given in that resource is as follows.

You need to disable “DirectPath I/O” on the host. Navigate to Hardware –> PCI Devices and make sure the graphics card is not selected as a passthrough device. Thanks to Simon Schaber from NVIDIA, who gave me the final clue.
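To double-check the result from the ESXi shell, something like the sketch below may help. It is not part of the original fix. It assumes that passthrough devices are recorded in /etc/vmware/esx.conf as lines of the form /device/<pci-address>/owner = "passthru", which you should verify on your own host. The function name list_passthru_devices is made up for this example. Note also that a change to the passthrough setting on ESXi 6.7 usually only takes effect after a host reboot.

```shell
# Sketch only: print the PCI addresses that esx.conf marks as passthrough.
# Assumption: such devices appear as lines of the form
#   /device/<pci-address>/owner = "passthru"
# The optional first argument overrides the config file path.
list_passthru_devices() {
    conf="${1:-/etc/vmware/esx.conf}"
    grep 'owner = "passthru"' "$conf" | sed 's|^/device/\([^/]*\)/owner.*|\1|'
}

# Suggested use on the host, after disabling passthrough and rebooting:
#   lspci | grep -i nvidia      # note the GPU's PCI address
#   list_passthru_devices       # the GPU's address should not be listed
#   nvidia-smi                  # should now report the GPU instead of the NVML error
```

If the GPU's address still appears in the output, the card is probably still reserved for passthrough, and the host driver cannot use it.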