CUDNN_STATUS_NOT_INITIALIZED when trying to run Keras, but not TensorFlow!
Curious. I could run code in TF and it would use the GPU with absolutely no problem. But other code of mine that used Keras would not run!
After hours of pulling my hair out and cursing, I figured it out. I have access to another machine, and the same Keras code ran on that one with no problems, so I figured the culprit could be the version of the NVIDIA driver, CUDA, or cuDNN.
On the faulty machine I had installed cuDNN from the “linux archive” instead of a simple “deb” file. So I reinstalled cuDNN from the “deb” file, but nothing changed.
The NVIDIA driver on the faulty machine was 390.77, while the good machine had 396.54. So I set out to install the NVIDIA 396 driver on Ubuntu 18.04.
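If you want to compare driver versions between two machines the way I did, something like the following should work. It is only a sketch: `driver_major` and `report_driver` are helper names I made up for illustration, and it assumes `nvidia-smi` is on the PATH whenever a driver is installed.

```shell
#!/bin/sh
# Hypothetical helper: print the major part of a driver version string,
# e.g. 396.54 -> 396.
driver_major() {
    printf '%s\n' "${1%%.*}"
}

# Hypothetical helper: report the installed driver version, if nvidia-smi exists.
report_driver() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        # Ask nvidia-smi for just the driver version of the first GPU.
        version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
        echo "driver version: $version (major: $(driver_major "$version"))"
    else
        echo "nvidia-smi not found; is a driver installed?"
    fi
}

report_driver
```

Run it on both machines and compare the output; in my case the difference was 390 versus 396.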
First you need to delete your current driver:
sudo apt purge 'nvidia-*'
Then add the graphics card driver repository:
sudo add-apt-repository ppa:graphics-drivers/ppa
Then you need to install the driver:
sudo apt install nvidia-driver-396 nvidia-utils-396 nvidia-kernel-common-396
Reboot.
That’s it. After the reboot everything should be working now.
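To confirm that the new driver is actually the one loaded after the reboot, a quick check along these lines may help. Again, this is a sketch under assumptions: `driver_matches` is a made-up helper, and 396 is simply the version I was targeting.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version string $1 has major number $2.
driver_matches() {
    case "$1" in
        "$2".*) return 0 ;;
        *) return 1 ;;
    esac
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # Driver version reported for the first GPU.
    installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if driver_matches "$installed" 396; then
        echo "OK: driver $installed is loaded"
    else
        echo "Unexpected driver version: ${installed:-none reported}"
    fi
else
    echo "nvidia-smi not found; the driver install may have failed"
fi
```

If the check passes, rerun the Keras code that was failing to see whether the CUDNN_STATUS_NOT_INITIALIZED error is gone.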