blackyang (Xiao Yang) #1
Hi, I have two loss functions whose return values are Variables wrapping a CPU tensor and a GPU tensor, respectively. Therefore I cannot do:
loss = loss1 + loss2
loss.backward()
because loss1.data is a CPU tensor while loss2.data is a GPU tensor. How do I back-propagate correctly? Thanks!
SimonW (Simon Wang) #2
loss1.gpu() + loss2
or
loss1 + loss2.cpu()
or
loss1.backward(); loss2.backward()
etc.
blackyang (Xiao Yang) #3
But a Variable doesn't have gpu() or cpu(), right?
The third method is slow in my case, because loss1 and loss2 share large subgraphs below them, so two separate backward calls traverse the shared part twice.
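For reference, here is roughly what that two-backward version looks like (a minimal sketch with a toy stand-in for the shared subgraph; the first call needs retain_graph, called retain_variables in older releases, so the second backward can reuse the graph):

import torch
from torch.autograd import Variable

x = Variable(torch.rand(10), requires_grad=True)
shared = (x * 2).tanh()          # stand-in for the expensive shared subgraph
loss1 = shared.sum()             # first loss built on the shared part
loss2 = (shared ** 2).sum()      # second loss reusing the same subgraph

# Two separate backward passes: the first must retain the graph so the
# second can run, and the shared subgraph is walked both times.
loss1.backward(retain_graph=True)
loss2.backward()                 # gradients accumulate into x.grad
print(x.grad)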
blackyang (Xiao Yang) #4
Just found a quick hack: suppose loss1 is the Variable holding the CPU tensor; then we can directly set
loss1.data = loss1.data.cuda()
The gradients are also correct. Verified by a simple toy example:
import torch
from torch.autograd import Variable

x1 = Variable(torch.rand(10), requires_grad=True)         # CPU leaf
x2 = Variable(torch.rand(10).cuda(), requires_grad=True)  # GPU leaf

x1.data = x1.data.cuda()  # move the underlying storage to the GPU in place

y = x1 + 2 * x2
y.backward(y.data.clone().fill_(1))  # seed the backward pass with ones

print(x1.grad)
print(x2.grad)
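(For y = x1 + 2 * x2 the chain rule gives dy/dx1 = 1 and dy/dx2 = 2, so x1.grad should print as all ones and x2.grad as all twos, both now CUDA tensors since x1's storage was moved to the GPU.)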
SimonW (Simon Wang) #5
Sorry, I meant .cuda() instead of .gpu().
colesbury (Sam Gross) #6
Variable has a .cuda() and a .cpu() method. Gradients are also correctly back-propagated through the call.
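So the clean fix for the original question is simply to move one loss onto the other's device before summing. A minimal sketch, with toy losses standing in for the real loss1 and loss2:

import torch
from torch.autograd import Variable

x1 = Variable(torch.rand(10), requires_grad=True)          # CPU leaf
x2 = Variable(torch.rand(10).cuda(), requires_grad=True)   # GPU leaf

loss1 = (x1 * 3).sum()   # CPU-side loss
loss2 = (x2 * 2).sum()   # GPU-side loss

# .cuda() on a Variable is differentiable, so one backward pass suffices.
total = loss1.cuda() + loss2
total.backward()

print(x1.grad)  # all threes, still on the CPU
print(x2.grad)  # all twos, on the GPU

Going the other way with loss1 + loss2.cpu() works the same way.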