`torch.svd()` runs on the CPU when it should run on the GPU


I am running PyTorch 1.2.0 in a Conda environment on a remote Linux machine. I built PyTorch from source against the CUDA toolkit 9.1. I checked whether the GPU is available with `torch.cuda.is_available()`, and it is.

Using the pre-built Conda PyTorch package was not an option for me, as the GPU drivers on the remote machine are too old for the CUDA toolkit 9.2 or 10, the versions for which PyTorch binaries are provided.

I have a BxNxM tensor `t` on the GPU, with B around 100000. When I call `torch.svd(t)`, the GPU sits idle while CPU usage climbs to 100% or more. At the end of the computation, the resulting decomposition is on the GPU. In addition, running the SVD on the GPU tensor is far slower than running it on a CPU tensor, to the point of being unusable.

The weird thing is that, before calling `torch.svd(t)`, I do other computations on the GPU without any issue. It seems that the SVD computation is moved to the CPU and the results are then moved back to the GPU.
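For reference, here is a minimal sketch of the kind of check described above. The tensor shape is a made-up placeholder, and the script falls back to the CPU when no GPU is visible. It only reports where the tensors live, so it cannot tell which processor actually did the work.

```python
import torch

# Build information: PyTorch version and the CUDA version it was compiled against
# (torch.version.cuda is None for CPU-only builds).
print(torch.__version__, torch.version.cuda)

# Fall back to the CPU so the script still runs without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical small batch; the real tensor is BxNxM with B around 100000.
t = torch.randn(16, 5, 3, device=device)

u, s, v = torch.svd(t)

# Input and outputs should all report the same device.
print(t.device, u.device, s.device, v.device)
```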

My final goal is to compute the nuclear norm of each NxM matrix in my BxNxM tensor, so I also tried `torch.norm(t, p='nuc', dim=(1, 2))`, but the same behavior occurs.
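To make the link between the two calls explicit: the nuclear norm of a matrix is the sum of its singular values, so both routes need an SVD per matrix. A minimal sketch, assuming a small made-up batch shape and falling back to the CPU when no GPU is available:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical sizes, chosen only for illustration.
B, N, M = 8, 5, 3
t = torch.randn(B, N, M, device=device)

# Only the singular values are needed, so skip computing U and V.
_, s, _ = torch.svd(t, compute_uv=False)

# Nuclear norm per matrix = sum of its singular values.
nuc_from_svd = s.sum(dim=-1)

# Same quantity through torch.norm.
nuc_from_norm = torch.norm(t, p='nuc', dim=(1, 2))

print(torch.allclose(nuc_from_svd, nuc_from_norm, atol=1e-4))
```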

Any guess? Thanks.


Nothing within PyTorch will move the data between CPU and GPU automatically, so the computations are definitely not done on the CPU.
I am not sure, though, that a batched version of svd exists. It may be doing a for-loop over the batch under the hood and calling svd on each submatrix. If your NxM matrices are small, you would see very high CPU usage from launching the kernels for each matrix, while the GPU stays almost idle because it has almost nothing to do.
Unfortunately, this is expected behavior for the GPU: unless you have large parallelizable operations, it will be less efficient than the CPU.
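If you want to measure this on your machine, here is a rough timing sketch. The helper name `time_svd` and the sizes are made up for illustration, and the GPU branch only runs when CUDA is available. The `torch.cuda.synchronize()` calls matter because CUDA kernels are launched asynchronously, so the timer would otherwise stop before the GPU has finished.

```python
import time
import torch

def time_svd(t, repeats=3):
    """Return the best wall-clock time in seconds of torch.svd(t) over a few runs."""
    best = float("inf")
    for _ in range(repeats):
        if t.is_cuda:
            torch.cuda.synchronize()  # make sure earlier GPU work is done
        start = time.perf_counter()
        torch.svd(t)
        if t.is_cuda:
            torch.cuda.synchronize()  # wait for the SVD itself to finish
        best = min(best, time.perf_counter() - start)
    return best

# Hypothetical sizes: many small matrices, as in the question but with a smaller batch.
B, N, M = 1000, 4, 3
t_cpu = torch.randn(B, N, M)

timings = {"cpu": time_svd(t_cpu)}
if torch.cuda.is_available():
    timings["cuda"] = time_svd(t_cpu.cuda())

print(timings)
```

Taking the best of a few runs keeps one-time startup costs, such as CUDA context creation, out of the comparison.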
