blackyang
(Xiao Yang)
#1
Hi, I have two loss functions whose return values are Variables wrapping a CPU tensor and a GPU tensor, respectively. Therefore, I cannot do:

```
loss = loss1 + loss2
loss.backward()
```

because `loss1.data` is a CPU tensor and `loss2.data` is a GPU tensor. How do I correctly do back-propagation? Thanks!

SimonW
(Simon Wang)
#2
`loss1.gpu() + loss2`

or

`loss1 + loss2.cpu()`

or

`loss1.backward(); loss2.backward()`

etc.
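
For example, the last option back-propagates each loss on its own device. A minimal sketch, with two hypothetical parameters (`w_cpu`, `w_gpu`) standing in for whatever produces the two losses:

```
import torch
from torch.autograd import Variable

w_cpu = Variable(torch.rand(10), requires_grad=True)         # CPU parameter
w_gpu = Variable(torch.rand(10).cuda(), requires_grad=True)  # GPU parameter

loss1 = (w_cpu * 3).sum()   # CPU loss
loss2 = (w_gpu ** 2).sum()  # GPU loss

# Two separate backward passes; each populates the .grad of its own leaves.
loss1.backward()
loss2.backward()
print(w_cpu.grad)
print(w_gpu.grad)
```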


blackyang
(Xiao Yang)
#3
But a Variable doesn’t have `.gpu()` or `.cpu()`, right?

The third method is slow, because in my case `loss1` and `loss2` share many subgraphs below them, so calling `backward()` twice traverses the shared part twice.

blackyang
(Xiao Yang)
#4
Just found a quick hack:

suppose `loss1` is the CPU-tensor Variable; then we can directly set:

```
loss1.data = loss1.data.cuda()
```

The gradients are also correct. Verified by a simple toy example:

```
import torch
from torch.autograd import Variable

x1 = Variable(torch.rand(10), requires_grad=True)         # CPU leaf
x2 = Variable(torch.rand(10).cuda(), requires_grad=True)  # GPU leaf

# the hack: swap x1's underlying storage to the GPU
x1.data = x1.data.cuda()

y = x1 + 2 * x2
y.backward(y.data.clone().fill_(1))  # backward with a gradient of ones

print(x1.grad)  # expect ones (dy/dx1 = 1)
print(x2.grad)  # expect twos (dy/dx2 = 2)
```

SimonW
(Simon Wang)
#5
Sorry, I meant `.cuda()` instead of `.gpu()`.

colesbury
(Sam Gross)
#6
Variable has a `.cuda()` and a `.cpu()` method. Gradients are also correctly back-propagated through the call.
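
For example, a minimal sketch in the same style as the toy example above (hypothetical `loss1`/`loss2`, assuming a CUDA device is available), moving the CPU loss with `.cuda()` so a single backward pass covers both:

```
import torch
from torch.autograd import Variable

x1 = Variable(torch.rand(10), requires_grad=True)         # CPU leaf
x2 = Variable(torch.rand(10).cuda(), requires_grad=True)  # GPU leaf

loss1 = x1.sum()        # CPU loss
loss2 = (2 * x2).sum()  # GPU loss

loss = loss1.cuda() + loss2  # move the CPU loss to the GPU, then combine
loss.backward()

print(x1.grad)  # gradient flows back through .cuda() to the CPU Variable
print(x2.grad)
```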