CUDNN_STATUS_NOT_INITIALIZED when trying to run Keras, but not TensorFlow!

Curious. I could run TensorFlow code and it would use the GPU with absolutely no problem. But a separate Keras script would not run and failed with CUDNN_STATUS_NOT_INITIALIZED!

After hours of pulling my hair out and cursing, I figured it out. I have access to another machine, and the same Keras code ran on that one with no problems. That pointed to a difference between the two machines: the version of the NVIDIA driver, CUDA, or cuDNN.
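A quick way to compare the two machines is to print the three versions on each and look at them side by side. The following is only a minimal sketch, not the exact commands I ran: the function names are made up for illustration, and the two cudnn.h locations are assumptions that depend on how cuDNN was installed (newer cuDNN releases keep the version numbers in a different header).

```shell
# Print the driver, CUDA and cuDNN versions found on this machine.
# Run it on both machines and compare the output.

# Driver version as reported by nvidia-smi.
driver_version() {
    v=""
    if command -v nvidia-smi >/dev/null 2>&1; then
        v=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n 1)
    fi
    echo "${v:-unknown}"
}

# CUDA toolkit version as reported by nvcc (must be on PATH).
cuda_version() {
    v=""
    if command -v nvcc >/dev/null 2>&1; then
        v=$(nvcc --version 2>/dev/null | grep -o 'release [0-9.]*' | head -n 1)
    fi
    echo "${v:-unknown}"
}

# cuDNN version read from the #define lines in cudnn.h.
# Assumed locations: /usr/include (deb install) and
# /usr/local/cuda/include (archive install).
cudnn_version() {
    for header in /usr/include/cudnn.h /usr/local/cuda/include/cudnn.h; do
        if [ -f "$header" ]; then
            major=$(awk '$1 == "#define" && $2 == "CUDNN_MAJOR" {print $3}' "$header")
            minor=$(awk '$1 == "#define" && $2 == "CUDNN_MINOR" {print $3}' "$header")
            patch=$(awk '$1 == "#define" && $2 == "CUDNN_PATCHLEVEL" {print $3}' "$header")
            if [ -n "$major" ]; then
                echo "$major.$minor.$patch (from $header)"
                return 0
            fi
        fi
    done
    echo "unknown"
}

echo "driver: $(driver_version)"
echo "CUDA:   $(cuda_version)"
echo "cuDNN:  $(cudnn_version)"
```

Anything reported as "unknown" just means the tool or header was not found where this sketch looks, not necessarily that the component is missing.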

On the faulty machine I had originally installed cuDNN from the “linux archive” instead of a simple “deb” file. So, I reinstalled cuDNN from the deb file, but nothing changed.

The NVIDIA driver on the faulty machine was 390.77, while the good machine had 396.54. So, I set out to install NVIDIA driver 396 on Ubuntu 18.04.

First you need to delete your current driver:

sudo apt purge 'nvidia-*'
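Since the purge removes every package matching the pattern, it may be worth previewing what will go before running it for real. This is a hedged sketch rather than part of the original steps; preview_purge is just an illustrative name, and the exact output depends on what is installed.

```shell
# Preview what the purge would remove, without changing anything.
preview_purge() {
    # Installed packages matching the pattern (dpkg marks them with "ii").
    dpkg -l 'nvidia-*' 2>/dev/null | grep '^ii' \
        || echo "no installed nvidia-* packages found"

    # -s (simulate) prints the planned actions and makes no changes.
    # The pattern is quoted so the shell passes it to apt-get untouched.
    apt-get -s purge 'nvidia-*' \
        || echo "simulation failed; check the package pattern"
}

preview_purge
```

The simulation does not need sudo, because nothing is actually removed.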

Then add the graphics card driver repository:

sudo add-apt-repository ppa:graphics-drivers/ppa

Then you need to install the driver:

sudo apt install nvidia-driver-396 nvidia-utils-396 nvidia-kernel-common-396

Reboot.

That’s it. After the reboot, everything should work.
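To confirm that the new driver is the one actually loaded after the reboot, something like the following should do. It is a small sketch under the assumption that nvidia-smi is on the PATH; driver_major is a helper invented here, and 396 is simply the series installed above.

```shell
# Check that the loaded NVIDIA driver belongs to the expected series.

# Strip everything after the first dot: 396.54 -> 396
driver_major() {
    echo "${1%%.*}"
}

expected=396
installed=""
if command -v nvidia-smi >/dev/null 2>&1; then
    installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n 1)
fi

if [ -n "$installed" ] && [ "$(driver_major "$installed")" = "$expected" ]; then
    echo "driver $installed is loaded"
else
    echo "expected a $expected.x driver, found: ${installed:-none}"
fi
```

If this still reports the old version, the previous driver was probably not fully removed or the reboot has not happened yet.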
