Dear all:
I know that when
x_cuda = x_cpu.to(device)
it will trigger the error
ValueError: can't optimize a non-leaf Tensor
when you use optimizer = optim.Adam([x_cuda]).
The right way may be optimizer = optim.Adam([x_cpu]).
That is to say, we need to keep references to both x_cpu and x_cuda.
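For concreteness, here is a tiny sketch of what I mean (the shape and the variable names are only for illustration, and it assumes a CUDA device is available):

import torch
import torch.optim as optim

device = torch.device("cuda")

x_cpu = torch.ones(10, requires_grad=True)  # leaf tensor, created on CPU
x_cuda = x_cpu.to(device)                   # result of an op on x_cpu -> non-leaf

# optimizer = optim.Adam([x_cuda])  # ValueError: can't optimize a non-leaf Tensor
optimizer = optim.Adam([x_cpu])     # works, but x_cpu must be kept around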
Since in most cases our program will only keep a reference to the CUDA version of the tensors, such as:
self.vars = [
    # every element here is non-leaf, because .to(device) returns a new tensor
    # [28*28, 512]
    torch.ones(512, 2 * 28 * 28, requires_grad=True).to(device),
    torch.zeros(512, requires_grad=True).to(device),
    # [512, 256]
    torch.ones(256, 512, requires_grad=True).to(device),
    torch.zeros(256, requires_grad=True).to(device),
    # [256, n]
    torch.ones(n_class, 256, requires_grad=True).to(device),
    torch.zeros(n_class, requires_grad=True).to(device)
]
So I wonder: how should we pass the parameters to the optimizer when we do not want to keep a reference to the CPU version of each tensor?
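Would something like the following work, i.e. creating the tensors directly on the device with the device= argument so that they remain leaf tensors and can be passed to the optimizer as they are? (Just a sketch of what I have in mind, not sure whether this is the recommended way.)

self.vars = [
    # created directly on the device, so each tensor is a leaf
    torch.ones(512, 2 * 28 * 28, device=device, requires_grad=True),
    torch.zeros(512, device=device, requires_grad=True),
    torch.ones(256, 512, device=device, requires_grad=True),
    torch.zeros(256, device=device, requires_grad=True),
    torch.ones(n_class, 256, device=device, requires_grad=True),
    torch.zeros(n_class, device=device, requires_grad=True)
]
optimizer = optim.Adam(self.vars)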