Hi everyone. Whenever I instantiate a Variable() and call cuda() on it, I seem to be unable to create an optimizer that optimizes over it:
x = Variable(torch.FloatTensor(some_np_array), requires_grad=True)
x = x.cuda()
optimizer = optim.SGD([x], lr=1e-2)
throws the exception
ValueError: can't optimize a non-leaf Variable
whereas leaving out the .cuda() call works fine. What is the correct way to optimize over variables that live on the GPU?
The problem is that when you do x = x.cuda(), the new x is not the same Variable as the old one: .cuda() returns a copy produced by an operation, so the result is a non-leaf. If you instead write x_cuda = x.cuda(), the original leaf x is preserved, and you can give x to the optimizer.
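A minimal sketch of why the error appears, using the current PyTorch API (where Variable has been merged into Tensor): any operation on a leaf, including .cuda(), returns a new non-leaf tensor, and the optimizer rejects non-leaves. A no-op multiply stands in for .cuda() here so the sketch also runs without a GPU:

```python
import torch
import torch.optim as optim

x = torch.randn(3, requires_grad=True)  # leaf: created directly by the user
y = x * 1.0  # any op (like .cuda()) returns a NEW, non-leaf tensor

print(x.is_leaf, y.is_leaf)  # True False

try:
    optim.SGD([y], lr=1e-2)  # rejected: .grad only accumulates on leaves
except ValueError as e:
    print(e)  # "can't optimize a non-leaf Tensor" (or similar wording)

opt = optim.SGD([x], lr=1e-2)  # fine: x is the leaf
```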
It is even better to move the tensor to the GPU before creating the Variable:
x = Variable(torch.FloatTensor(some_np_array).cuda(), requires_grad=True)
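In the current tensor API the same idea reads as below. Creating the tensor directly on the target device keeps it a leaf; the device name is chosen at runtime so the sketch also runs without a GPU, and the array contents are placeholder data:

```python
import numpy as np
import torch
import torch.optim as optim

device = "cuda" if torch.cuda.is_available() else "cpu"

some_np_array = np.array([1.0, 2.0, 3.0], dtype=np.float32)  # placeholder
x = torch.tensor(some_np_array, device=device, requires_grad=True)

# No op was applied after creation, so x is still a leaf on its device
# and can be handed to the optimizer directly.
print(x.is_leaf)  # True
optimizer = optim.SGD([x], lr=1e-2)
```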
Hi, I ran into the same problem, thanks for your tips. I wonder what the difference is between optimizing x and optimizing x_cuda. Since the computation runs on the GPU, why don't we optimize x_cuda directly? That seems to make more sense.
That is what the last suggestion above says: send the tensor to the GPU before wrapping it in a Variable, so that you can optimize that Variable directly.
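The reason optimizing the leaf still works in the two-variable pattern: backward() accumulates .grad only on leaves, so the gradient of a loss computed through x_cuda flows back to x, which the optimizer holds. A CPU-only sketch in the current API, where a no-op multiply stands in for the .cuda() copy:

```python
import torch
import torch.optim as optim

x = torch.randn(3, requires_grad=True)  # leaf, held by the optimizer
optimizer = optim.SGD([x], lr=1e-2)

x_copy = x * 1.0                 # stand-in for x.cuda(): a non-leaf copy
loss = (x_copy ** 2).sum()
loss.backward()                  # gradient flows back through the copy to x

print(x.grad)                    # populated; x_copy.grad stays None (non-leaf)
optimizer.step()                 # updates x; the copy must be re-made next step
```

Note the drawback: the GPU copy has to be recreated every iteration, which is why creating the leaf on the GPU in the first place is preferable.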
@albanD I don't understand why we should give the optimizer the CPU-side x instead of x_cuda. Our computation is carried out on x_cuda, so why not pass x_cuda to the optimizer?
It all depends on how you create them. Check this post for a detailed answer.