Hi, I have a problem running my code on a GPU. I am working on a remote server that has several GPUs, some of which are bigger than the others. On the other GPUs my code runs correctly without any problem. But when I try the big one, which reports: Memory: 60.63 GiB / 503.78 GiB (12.03%)
I get the following error. I created a new environment and reinstalled PyTorch, but that did not help: return torch.batch_norm( RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
I deactivated batch norm and changed the batch size, but neither helped.
I need to run my script on this GPU. Could you please help me solve this?
Which GPU are you using? Could you disable cudnn via torch.backends.cudnn.enabled = False and rerun your script? If that works, could you post the batchnorm setup as well as the input shapes?
Also, could you run python -m torch.utils.collect_env and post the output here?
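In case it helps, here is a minimal sketch of the check suggested above: run a single batch norm forward pass with cuDNN switched on or off and see whether the error reproduces. The function name `cudnn_batchnorm_check` and the input shape are made up for illustration, so substitute the shapes from your own model. Note that it changes the global `torch.backends.cudnn.enabled` flag.

```python
import importlib.util


def cudnn_batchnorm_check(shape=(8, 16, 32, 32), use_cudnn=True):
    """Run one BatchNorm2d forward pass on GPU 0 and report the result.

    `shape` is a hypothetical (N, C, H, W) input; replace it with your own.
    Returns a dict and does not raise if torch or a GPU is missing.
    """
    report = {
        "torch_installed": False,
        "cuda_available": False,
        "device": None,
        "cudnn_enabled": None,
        "ok": None,
        "error": None,
    }
    if importlib.util.find_spec("torch") is None:
        return report
    import torch

    report["torch_installed"] = True
    report["cuda_available"] = torch.cuda.is_available()
    if not report["cuda_available"]:
        return report

    # Global switch: False makes PyTorch use its native kernels instead of cuDNN.
    torch.backends.cudnn.enabled = use_cudnn
    report["cudnn_enabled"] = torch.backends.cudnn.enabled
    report["device"] = torch.cuda.get_device_name(0)
    try:
        bn = torch.nn.BatchNorm2d(shape[1]).cuda()
        x = torch.randn(*shape, device="cuda")
        bn(x)
        # CUDA errors are reported asynchronously; synchronize to surface them here.
        torch.cuda.synchronize()
        report["ok"] = True
    except RuntimeError as exc:
        report["ok"] = False
        report["error"] = str(exc)
    return report


if __name__ == "__main__":
    for flag in (True, False):
        print(f"use_cudnn={flag}:", cudnn_batchnorm_check(use_cudnn=flag))
```

If the pass fails with cuDNN enabled but succeeds with it disabled, that points at the cuDNN path rather than at your model code, and the shapes you used are what would be useful to post.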
I am using an A100-SXM4-40GB, and all of the GPUs are available. Our group has 4 GPUs of this type, and I am running my script on the remote server.
For comparison, I have a V100 32GB with NVIDIA driver 460,
and I installed PyTorch inside a conda environment using conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c nvidia
I did not have to install anything outside the conda environment. Try this in a new environment; I hope it works.
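Since the install command above pins a CUDA toolkit version, one thing worth checking after installing is whether the PyTorch build was actually compiled for the GPU's architecture. The A100 has compute capability 8.0 and the V100 has 7.0. Below is a rough sketch: `build_supports_device` and `report_current_build` are hypothetical helper names, and the check is simplified. It only looks at `sm_XY` binary targets (same major version, minor not above the device's) and ignores PTX `compute_XY` entries, which can also run on newer GPUs through JIT compilation.

```python
import importlib.util
import re


def build_supports_device(capability, arch_list):
    """Return True if some 'sm_XY' entry in arch_list can run on the device.

    capability: (major, minor) tuple, e.g. (8, 0) for an A100.
    arch_list:  strings such as those returned by torch.cuda.get_arch_list().
    Simplified rule: same major version and arch minor <= device minor.
    """
    major, minor = capability
    for arch in arch_list:
        match = re.fullmatch(r"sm_(\d+)(\d)", arch)
        if match is None:
            continue  # skip PTX ('compute_XY') and any other entry
        arch_major, arch_minor = int(match.group(1)), int(match.group(2))
        if arch_major == major and arch_minor <= minor:
            return True
    return False


def report_current_build():
    """Print what the installed PyTorch build reports, if torch and a GPU exist."""
    if importlib.util.find_spec("torch") is None:
        print("torch is not installed in this environment")
        return
    import torch

    print("PyTorch:", torch.__version__, "| built with CUDA:", torch.version.cuda)
    if not torch.cuda.is_available():
        print("No CUDA device visible")
        return
    capability = torch.cuda.get_device_capability(0)
    arch_list = torch.cuda.get_arch_list()
    print("Device:", torch.cuda.get_device_name(0), "| capability:", capability)
    print("Compiled architectures:", arch_list)
    print("Binary match:", build_supports_device(capability, arch_list))


if __name__ == "__main__":
    report_current_build()
```

With a made-up arch list such as `["sm_37", "sm_50", "sm_60", "sm_70"]`, the helper returns True for a V100 `(7, 0)` and False for an A100 `(8, 0)`, which would match a script working on one GPU and failing on the other.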
Hi, yesterday I created a new conda env and installed PyTorch with cudatoolkit=11.2, and it worked. But half an hour later, when I tried again, it did not work. I got this message while trying to install my own package:
WARNING: Value for scheme.headers does not match. Please report this to <https://github.com/pypa/pip/issues/9617>
distutils: /home/envs/vir-env4/include/python3.9/UNKNOWN
sysconfig: /home/anaconda3/envs/vir-env4/include/python3.9
WARNING: Additional context:
user = False
home = None
root = None
prefix = None
Obtaining file:///mnt/home/Baysian
Installing collected packages: Baysian-Seg
Running setup.py develop for Baysian-Seg
WARNING: Value for scheme.headers does not match. Please report this to <https://github.com/pypa/pip/issues/9617>
distutils: /home/anaconda3/envs/vir-env4/include/python3.9/UNKNOWN
sysconfig: /home/anaconda3/envs/vir-env4/include/python3.9
WARNING: Additional context:
user = False
home = None
root = None
prefix = None