Is it possible to send a tunable parameter to CUDA before defining the optimizer for it?
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
param = torch.FloatTensor().clone().detach().requires_grad_(True)
param = param.to(device)
opt = torch.optim.SGD([param], lr=0.01)
ValueError: can't optimize a non-leaf Tensor
The to() operation is differentiable and thus creates a non-leaf tensor. Move the data to the device first and create the leaf tensor afterwards.
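A minimal sketch of that fix (the shape and torch.zeros are placeholder assumptions, not from the original post): the tensor is created directly on the target device and only then marked as requiring gradients, so no differentiable op runs on it and it stays a leaf that the optimizer accepts.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create the tensor on the target device first, then enable gradients;
# since no differentiable operation has run on it, it remains a leaf.
param = torch.zeros(3, device=device).requires_grad_(True)

opt = torch.optim.SGD([param], lr=0.01)  # no ValueError now
print(param.is_leaf)
```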
Suppose I define a tensor that I want to tune on CUDA. I have 3 steps:

1. define the tensor
2. move it to CUDA
3. call .requires_grad_() on it
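These steps can be sketched as follows (the shape and torch.zeros are placeholder assumptions); because gradients are not enabled until after the move, the .to() call does not create a non-leaf tensor:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Step 1: define the tensor (arbitrary example values)
param = torch.zeros(3)
# Step 2: move it to the device; no gradients yet, so the result is still a leaf
param = param.to(device)
# Step 3: enable gradients in place on the moved tensor
param.requires_grad_(True)

print(param.is_leaf)
```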
When is the proper time to define my optimizer for it? Between 1 and 2? Does the order matter?