GPU implementation of QR/SVD

(wir sind die roboter 🤖) #1


Is it possible to use GPU-accelerated factorizations? I can’t find any information in the documentation, and it seems that in version “0.3.0.post4” PyTorch copies the tensor to RAM and runs a CPU implementation of the factorizations.


torch.qr and torch.svd both run on the GPU if the input is a CUDA tensor, though parts of the algorithm may still run on the CPU.
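As a quick sanity check (a minimal sketch, assuming a recent PyTorch where the factorization lives under torch.linalg rather than the old torch.qr, and falling back to CPU when no GPU is present), you can confirm that the factors are allocated on the same device as the input:

```python
import torch

# Use the GPU if one is available; otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

A = torch.randn(400, 400, device=device)

# torch.linalg.qr is the modern spelling of torch.qr;
# Q and R live on the same device as A.
Q, R = torch.linalg.qr(A)

print(Q.device, R.device)  # matches A.device
print(torch.allclose(Q @ R, A, atol=1e-3, rtol=1e-3))
```

If the devices printed here match the input's device, the factorization is at least dispatching to the CUDA path; it says nothing about how much of the work the backend actually performs on the GPU.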

(wir sind die roboter 🤖) #3


I also thought that if the input is a CUDA tensor it should use the GPU, but running

import torch

T = torch.randn(400, 400).cuda()
for i in range(100):
    somevar = torch.qr(T)

causes only a marginal increase in volatile GPU utilization (8–10%) but maxes out all CPU cores.

(linyu) #4

Hi, have you solved your problem? I have the same problem as you.