Failed to initialize NVML: Driver/library version mismatch

Yesterday my users started reporting an error when running nvidia-smi:

Failed to initialize NVML: Driver/library version mismatch

Additionally, users report the following error when trying to run scripts:

kernel version 367.57.0 does not match DSO version 375.39.0

I see from my apt logs that yesterday morning an automated update installed the 375.39 NVIDIA driver packages, apparently because they were marked as a security update. Both nvidia-367 and nvidia-375 are now present in dpkg, but nvidia-367 is described as “Transitional package for nvidia-375”.

We’re running Ubuntu 16.04.
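For anyone who wants to confirm the mismatch before rebooting, here is a minimal Python sketch. It pulls the two versions out of the error message quoted above and, if the file is present, reads the loaded kernel module version from /proc/driver/nvidia/version. The regular expression for that file is an assumption based on the usual "NVRM version: ... Kernel Module  367.57  ..." line and may need adjusting for other driver releases; the function names are made up for illustration.

```python
import re
from pathlib import Path

# Path exposed by the NVIDIA kernel module while it is loaded.
PROC_VERSION = Path("/proc/driver/nvidia/version")


def parse_mismatch(message):
    """Return (kernel_version, dso_version) from the mismatch error, or None."""
    m = re.search(r"kernel version (\S+) does not match DSO version (\S+)", message)
    return (m.group(1), m.group(2)) if m else None


def parse_kernel_module_version(text):
    """Extract the loaded module version from the /proc file contents.

    Assumption: the version is the first dotted number after the words
    "Kernel Module" on the NVRM line.
    """
    m = re.search(r"Kernel Module.*?\s(\d+\.\d+(?:\.\d+)?)\s", text)
    return m.group(1) if m else None


if __name__ == "__main__":
    error = "kernel version 367.57.0 does not match DSO version 375.39.0"
    print("from error message:", parse_mismatch(error))
    if PROC_VERSION.exists():
        loaded = parse_kernel_module_version(PROC_VERSION.read_text())
        print("loaded kernel module:", loaded)
    else:
        print("NVIDIA kernel module does not appear to be loaded")
```

If the loaded module version is older than the package version dpkg reports, the user-space libraries were upgraded underneath a module that is still running.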

It looks like a reboot fixed the problem. The unattended upgrade must have left the system in an inconsistent state, with the old 367.57 kernel module still loaded while the 375.39 libraries were already installed.

I had the same problem on g2.2x and p2.2x instances on AWS. Rebooting the instances fixed it!

I get this error (Failed to initialize NVML) only inside the nvidia-docker container; outside the container everything is fine.
A reboot did not help. The driver version is 375.39. Any hints?

This happened to us again today. I assume the automatic package upgrade replaced the driver files on disk while the old version was still loaded in memory, and the two now conflict. It would be nice if we could correct what’s in memory without having to reboot the system, as these are servers with multiple users.
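One approach that may avoid the reboot is to unload the old NVIDIA kernel modules and load the new ones. Nobody in this thread has confirmed it, so treat the following as a sketch under assumptions: it only works when no process is still using the GPU (X server, CUDA jobs, persistence daemon), and it assumes the fourth field of each /proc/modules line lists the modules that depend on that module. The script only prints the rmmod and modprobe commands in a safe order and does not run them; the function name is hypothetical.

```python
from pathlib import Path


def nvidia_unload_order(proc_modules_text):
    """Return loaded nvidia* modules ordered so dependents come first."""
    users = {}
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[0].startswith("nvidia"):
            continue
        # Fourth field: comma-separated modules using this one, or "-".
        users[fields[0]] = {u for u in fields[3].split(",") if u and u != "-"}

    order = []
    remaining = dict(users)
    while remaining:
        # A module can go once none of its users are still loaded.
        free = sorted(n for n, u in remaining.items() if not u.intersection(remaining))
        if not free:
            raise RuntimeError("could not resolve module unload order")
        for name in free:
            order.append(name)
            del remaining[name]
    return order


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")
    text = proc_modules.read_text() if proc_modules.exists() else ""
    modules = nvidia_unload_order(text)
    if not modules:
        print("no nvidia modules loaded")
    for name in modules:
        print("sudo rmmod", name)
    if modules:
        print("sudo modprobe nvidia")
```

If rmmod reports that a module is in use, some process still holds the GPU, and on a multi-user server that probably means waiting for or stopping those jobs first.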

Hello friend, I got a similar error when I tried to run nvidia-smi in Docker. Have you solved your problem?