Optimizing over .cuda() variables

Hi everyone. Whenever I instantiate a Variable() and call .cuda() on it, I seem to be unable to create an optimizer that optimizes over it:

import torch
import torch.optim as optim
from torch.autograd import Variable

x = Variable(torch.FloatTensor(some_np_array), requires_grad=True)
x = x.cuda()
optimizer = optim.SGD([x], lr=1e-2)

throws the exception

ValueError: can't optimize a non-leaf Variable

whereas not calling .cuda() works fine. What is the correct way to optimize over variables that can be processed on the GPU?

Hi,

The problem is that when you do x = x.cuda(), the new x is not the same Variable as the old one: .cuda() is an operation tracked by autograd, so its output is a non-leaf Variable, and the optimizer only accepts leaf Variables (ones you created directly).
If you do x_cuda = x.cuda() instead, then you can give x, the leaf, to the optimizer.
It is even better to send the tensor to cuda before creating the Variable:

x = Variable(torch.FloatTensor(some_np_array).cuda(), requires_grad=True)
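For completeness, here is a minimal end-to-end sketch of that pattern. The array, the learning rate, and the squared-norm objective are made up for illustration, and it assumes a CUDA device is available:

import numpy as np
import torch
import torch.optim as optim
from torch.autograd import Variable

some_np_array = np.random.randn(10).astype(np.float32)  # hypothetical data

# Create the Variable from a tensor that is already on the GPU, so it is a leaf
x = Variable(torch.FloatTensor(some_np_array).cuda(), requires_grad=True)
optimizer = optim.SGD([x], lr=1e-2)

for step in range(100):
    optimizer.zero_grad()
    loss = (x ** 2).sum()  # toy objective: pull x towards zero
    loss.backward()
    optimizer.step()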

Hi, I met the same problem. Thanks for your tips.
I wonder what the difference between x and x_cuda is.
Indeed, when our computation runs on the GPU, why don't we optimize x_cuda, since that seems to make more sense?
Thanks.

That is what the last suggestion above says: send the tensor to the GPU before wrapping it in a Variable, so that you can optimize that Variable directly.

@albanD I don't understand why we should give the optimizer the CPU Variable x instead of x_cuda.
Our computation is performed on x_cuda, so why not pass x_cuda to the optimizer?

Hi,

It all depends on how you create them. Check this post for a detailed answer.
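To make the leaf/non-leaf distinction concrete, here is a small sketch. It inspects grad_fn, which is None exactly for leaf Variables (the attribute name may differ in very old versions):

import torch
from torch.autograd import Variable

x = Variable(torch.FloatTensor(3), requires_grad=True)  # leaf: created directly by the user
x_cuda = x.cuda()                                       # non-leaf: the output of the .cuda() operation

print(x.grad_fn)       # None, so the optimizer accepts it
print(x_cuda.grad_fn)  # a backward function (e.g. CopyBackwards), so it is rejected

Note that with the x_cuda = x.cuda() pattern, the copy has to be redone after every optimizer step so that the update to x is visible on the GPU; creating the Variable from a CUDA tensor in the first place avoids that extra bookkeeping.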