When I run my scripts on this remote GPU, they take more than twice as long as they normally do on Google Colab.
The CUDA version on the remote GPU is 9.1.x, while the PyTorch I installed requires 10.2.
Is it possible that this version mismatch is causing the slowdown?
Is it possible to upgrade CUDA on the remote server?
Kushaj
(Kushajveer Singh)
May 22, 2020, 5:06pm
2
Remove PyTorch from the server. Then you can use !conda install pytorch torchvision cudatoolkit=10.2 -c pytorch in Jupyter to download PyTorch with CUDA 10.2. If you use pip, then use !pip install torch torchvision.
I already did that, but the output of cat /usr/local/cuda/version.txt still shows CUDA Version 9.1.85.
Kushaj
(Kushajveer Singh)
May 22, 2020, 5:25pm
4
What is the output of torch.version.cuda?
The output of torch.version.cuda is 10.2.
Kushaj
(Kushajveer Singh)
May 22, 2020, 5:27pm
6
So PyTorch is using CUDA 10.2.
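For anyone who wants to run the same check in one go, here is a minimal sketch. The helper name cuda_report is made up for illustration; torch.version.cuda, torch.cuda.is_available() and torch.cuda.get_device_name() are the PyTorch calls it relies on. It only reports what PyTorch itself sees, which may differ from what /usr/local/cuda/version.txt says.

```python
import importlib.util


def cuda_report():
    """Collect the CUDA details PyTorch itself reports (hypothetical helper)."""
    report = {"torch_installed": importlib.util.find_spec("torch") is not None}
    if not report["torch_installed"]:
        return report

    import torch

    # CUDA version the PyTorch binary was built against (None for CPU-only builds).
    report["built_with_cuda"] = torch.version.cuda
    # True only if PyTorch can actually use a GPU on this machine.
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If cuda_available comes back False, training is probably running on the CPU, which on its own could explain a large slowdown.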
Kushaj
(Kushajveer Singh)
May 22, 2020, 5:28pm
7
Also, conda installs CUDA in the Anaconda directory where all conda packages are stored. So checking the CUDA version installed system-wide on the machine will not matter.
Well, I agree, but why isn't it showing up under the directory /usr/local/ the way 9.1.x does?
Also, training takes far longer than it does on Google Colab.
Kushaj
(Kushajveer Singh)
May 22, 2020, 5:31pm
9
If you are using conda, then it is not installed in /usr/local/. It is in ~anaconda/envs/{something}.
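To see where the bundled CUDA runtime actually lives, something like the sketch below may help. The function name find_cuda_runtime and the two search patterns are assumptions about a typical Linux layout (a conda cudatoolkit package placing libcudart under the environment's lib directory, or a pip wheel shipping it inside torch/lib); other setups may differ.

```python
import glob
import os
import sys


def find_cuda_runtime(prefix=sys.prefix):
    """Return files under `prefix` whose name looks like a CUDA runtime library."""
    patterns = [
        # Assumed location for a conda-installed cudatoolkit on Linux.
        os.path.join(prefix, "lib", "libcudart*"),
        # Assumed location for a runtime bundled inside a pip-installed torch.
        os.path.join(prefix, "lib", "python*", "site-packages",
                     "torch", "lib", "libcudart*"),
    ]
    hits = []
    for pattern in patterns:
        hits.extend(glob.glob(pattern))
    return sorted(hits)


if __name__ == "__main__":
    # sys.prefix is the root of the active environment.
    print("environment:", sys.prefix)
    for path in find_cuda_runtime():
        print(path)
```

Run it with the Python interpreter from the environment you train in, so that sys.prefix points at that environment.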
Kushaj
(Kushajveer Singh)
May 22, 2020, 5:38pm
10
There may be other reasons for the slowdown: a slower CPU, a weaker GPU, or a slower storage type.
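One rough way to narrow this down is to time a small CPU task and a small GPU task on both machines and compare. This is only a sketch: time_call, cpu_loop and gpu_matmul are names invented here, the sizes are arbitrary, and the GPU timing includes creating the random matrices, so treat the numbers as a coarse comparison rather than a benchmark.

```python
import time


def time_call(fn, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def cpu_loop(n=200_000):
    """A pure-Python loop as a rough single-core CPU check."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def gpu_matmul(size=2048):
    """One matrix multiply on the GPU; needs PyTorch and a working CUDA device."""
    import torch

    a = torch.randn(size, size, device="cuda")
    b = torch.randn(size, size, device="cuda")
    # .item() copies the result to the host, so the GPU work has finished
    # before the timer stops.
    (a @ b).sum().item()


if __name__ == "__main__":
    print("cpu loop (s):", time_call(cpu_loop))
    try:
        print("gpu matmul (s):", time_call(gpu_matmul))
    except Exception as exc:  # no PyTorch or no usable GPU
        print("gpu matmul skipped:", exc)
```

If both numbers look similar on the server and on Colab, the bottleneck is more likely in data loading or storage than in the CPU or GPU itself.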