NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver

Hi Team,

I have a workstation with Ubuntu 18.04 installed on it. I installed the following NVIDIA software:

  1. CUDA 10.0
  2. NVIDIA nvidia-driver-415

I need to run some software for research purposes, and it has to run in GPU mode, so the NVIDIA driver must be installed correctly.

I am verifying GPU mode by running the command nvidia-smi, but I am getting the error below.

vivek@Workstation1:~$ sudo nvidia-smi
[sudo] password for vivek:
NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Can you please help me with this?

Thanks
Rajeev
rajeevsys@gmail.com

nvidia-bug-report.log.gz (54.8 KB)

How did you install the driver? Please run nvidia-bug-report.sh as root and attach the resulting .gz file to your post. Hovering the mouse over an existing post of yours will reveal a paperclip icon.
https://devtalk.nvidia.com/default/topic/1043347/announcements/attaching-files-to-forum-topics-posts/

Please find the attached nvidia-bug-report.sh report.

I hope this gives you some insight. This is fairly urgent, so could you please help me with this?
nvidia-bug-report.log.gz (54.8 KB)

The kernel module can’t be compiled because the kernel headers are missing:

ERROR: Failed to run `/sbin/dkms build -m nvidia -v 410.78 -k 3.10.0-957.1.3.el7.x86_64`: Error! echo
Your kernel headers for kernel 3.10.0-957.1.3.el7.x86_64 cannot be found at
/lib/modules/3.10.0-957.1.3.el7.x86_64/build or /lib/modules/3.10.0-957.1.3.el7.x86_64/source.
-> error.
ERROR: Failed to install the kernel module through DKMS. No kernel module was installed; please try installing again without DKMS, or check the DKMS logs for more information.
ERROR: Installation has failed.  Please see the file '/var/log/nvidia-installer.log' for details.  You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.

Please install the packages kernel-devel and kernel-headers and make sure you’re running the matching kernel version.

BTW, according to your logs, you’re running CentOS Linux release 7.6.1810, not Ubuntu.
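
Both points in this reply can be checked from a terminal before rerunning the installer. The following is only a minimal sketch: the helper names distro_id and has_kernel_headers are made up for illustration, and the optional root-prefix argument exists purely so the check can be tried against a scratch directory. It assumes the distribution ships /etc/os-release and that the headers appear under /lib/modules/<release>/build, which is the path the installer error refers to.

```shell
#!/bin/sh
# Sketch: report the distribution and check for headers matching the running kernel.

# Print the ID field (for example "centos" or "ubuntu") from an os-release style file.
# Runs in a subshell so the sourced variables do not leak into the caller.
distro_id() {
    ( . "${1:-/etc/os-release}" && printf '%s\n' "$ID" )
}

# Succeed if a kernel build tree exists for the given kernel release.
# $1 = kernel release (as printed by `uname -r`)
# $2 = optional root prefix (hypothetical, only useful for testing)
# -e follows symlinks, so a dangling "build" link counts as missing.
has_kernel_headers() {
    [ -e "${2:-}/lib/modules/$1/build" ]
}

echo "Distribution: $(distro_id)"
if has_kernel_headers "$(uname -r)"; then
    echo "Kernel build tree found for $(uname -r)"
else
    echo "No kernel build tree for $(uname -r); install the matching kernel-devel and kernel-headers packages"
fi
```

If the second message is printed right after installing kernel-devel, the installed package version probably does not match the running kernel, and updating the kernel and rebooting is one way to bring the two back in line.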

You are right. I was facing problems on Ubuntu, so I installed CentOS.
Could we do a screen-sharing session? If yes, how can we set it up?

sudo yum install kernel-devel kernel-headers
(on CentOS 7 the package manager is yum; dnf is not installed by default)
then reinstall the drivers

sudo dpkg -i cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-0-local/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda-toolkit-10-0
then reboot
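
After the reboot, one way to confirm whether the driver actually loaded is to look for the nvidia module in the kernel's module list. This is a rough sketch rather than an official procedure: module_loaded is a hypothetical helper, and its second argument is only there so the check can be pointed at a sample file instead of /proc/modules.

```shell
#!/bin/sh
# Sketch: check whether a kernel module appears in /proc/modules.

# $1 = module name, $2 = optional path to a /proc/modules style file.
# The module name is the first whitespace-separated field on each line.
module_loaded() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

if module_loaded nvidia; then
    echo "nvidia kernel module is loaded"
else
    echo "nvidia kernel module is not loaded; check 'dkms status' and 'dmesg' for build or load errors"
fi
```

When the module is missing here, nvidia-smi will report the same "couldn't communicate with the NVIDIA driver" error, so this check helps separate a build failure from a load failure.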

I ran the commands suggested above, but I am still facing the issue.

Can you please check and advise? I am attaching the newly created bug report.
I am also attaching the nvidia-installer.log file.

nvidia-bug-report.log.gz (57.1 KB)
nvidia-installer.log (3.65 KB)

The kernel module is now installed, but Secure Boot is enabled, so the module can't be loaded because it isn't signed. Please disable Secure Boot in the BIOS.
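
The Secure Boot state can usually be confirmed from the running system before going into the firmware setup. The sketch below assumes the mokutil tool is installed and that `mokutil --sb-state` prints a line containing "SecureBoot enabled" or "SecureBoot disabled"; secure_boot_state is a hypothetical helper that only classifies that text, and anything else is reported as unknown.

```shell
#!/bin/sh
# Sketch: classify the output of `mokutil --sb-state`.

# $1 = text printed by mokutil; prints enabled, disabled, or unknown.
secure_boot_state() {
    case "$1" in
        *"SecureBoot enabled"*)  echo enabled ;;
        *"SecureBoot disabled"*) echo disabled ;;
        *)                       echo unknown ;;
    esac
}

# Query mokutil only if it is available; otherwise report unknown.
if command -v mokutil >/dev/null 2>&1; then
    state=$(secure_boot_state "$(mokutil --sb-state 2>&1)")
else
    state=unknown
fi
echo "Secure Boot: $state"
```

If this reports enabled, the unsigned nvidia module will be rejected at load time; once Secure Boot is turned off in the firmware settings and the machine is rebooted, nvidia-smi should be able to reach the driver.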