Good morning!
I have been having a problem with PyTorch lately. I reformatted the computer and installed everything I need for work, including torch. Then I ran many experiments, each of which uses torch. They run properly until, after some hours, torch suddenly reports “CUDA driver initialization failed”.
The nvidia-smi command still works, but something has changed: the line under the name of the graphics card no longer shows the power usage (instead of 15W / 80W it says N/A / N/A). When I try to restart, the computer freezes, so I have to force a shutdown. After the forced restart, everything is OK again.
I have seen this in different conda environments, so, in principle, the problem is not the installation.
Do you know what to do in this case?
System: Ubuntu 22
Graphics card: Nvidia 4070
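
In case it helps with diagnosis, below is a minimal sketch of a check that could be run once the failure appears, so that both symptoms (the torch error and the N/A power reading) are captured together. The helper names `parse_power_field`, `query_gpu_power` and `cuda_status` are hypothetical and only for illustration. The sketch assumes nvidia-smi is on the PATH and that torch is importable in the active conda environment; it only reports the state and does not fix anything.

```python
import shutil
import subprocess


def parse_power_field(value):
    """Return the reading in watts, or None if nvidia-smi printed N/A."""
    text = value.strip().strip("[]")
    try:
        return float(text.split()[0])
    except (ValueError, IndexError):
        # Covers "N/A", "[N/A]", "Not Supported" and empty fields.
        return None


def query_gpu_power():
    """Return a list of (power_draw, power_limit) per GPU, or None on failure."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi",
             "--query-gpu=power.draw,power.limit",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, timeout=30,
        )
    except subprocess.TimeoutExpired:
        # Assumption: a wedged driver may make nvidia-smi hang.
        return None
    if result.returncode != 0:
        return None
    readings = []
    for line in result.stdout.strip().splitlines():
        fields = line.split(",")
        if len(fields) == 2:
            readings.append((parse_power_field(fields[0]),
                             parse_power_field(fields[1])))
    return readings


def cuda_status():
    """Try to initialize CUDA through torch and describe the outcome."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    try:
        torch.cuda.init()
    except Exception as exc:  # includes the driver initialization error
        return f"CUDA init failed: {exc}"
    return "CUDA init ok"


if __name__ == "__main__":
    print("torch:", cuda_status())
    print("power (draw, limit) per GPU:", query_gpu_power())
```

A power reading of `(None, None)` here would correspond to the N/A / N/A shown by nvidia-smi. Running the script from a fresh process matters, since a process that already failed to initialize CUDA may keep reporting the same error.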
Thank you very much in advance!
Sam.