Is it possible to send a tunable parameter to CUDA before defining the optimizer for it?
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
param = torch.FloatTensor().clone().detach().requires_grad_(True)
param = param.to(device)
opt = torch.optim.SGD([param], lr=0.01)
ValueError: can't optimize a non-leaf Tensor
The to() operation is differentiable and thus creates a non-leaf tensor. Move the data to the device first and create the leaf tensor afterwards.
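A minimal sketch of that fix (the shape and torch.zeros are placeholder assumptions, not from the original post): the tensor is created directly on the target device and only then marked as requiring gradients, so no differentiable op runs on it and it stays a leaf that the optimizer accepts.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create the tensor on the target device first, then enable gradients;
# since no differentiable operation has run on it, it remains a leaf.
param = torch.zeros(3, device=device).requires_grad_(True)

opt = torch.optim.SGD([param], lr=0.01)  # no ValueError now
print(param.is_leaf)
```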
Suppose I define a tensor that I want to tune on CUDA. I have 3 steps:

1. define the tensor
2. move it to CUDA
3. call .requires_grad_() on it
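These steps can be sketched as follows (the shape and torch.zeros are placeholder assumptions); because gradients are not enabled until after the move, the .to() call does not create a non-leaf tensor:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Step 1: define the tensor (arbitrary example values)
param = torch.zeros(3)
# Step 2: move it to the device; no gradients yet, so the result is still a leaf
param = param.to(device)
# Step 3: enable gradients in place on the moved tensor
param.requires_grad_(True)

print(param.is_leaf)
```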
When is the proper time to define my optimizer for it? Between 1 and 2? Does the order matter?