The `.cuda()` operation on the `nn.Parameter` is differentiable and will create a non-leaf tensor. Remove the `.cuda()` operation and call it on the `nn.Module` instead, or alternatively call it on the tensor before wrapping it in the `nn.Parameter`.
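As a minimal sketch of the two suggested fixes: the `Scale` module and its `weight` attribute are hypothetical names chosen for illustration, and the device falls back to the CPU when no GPU is present so the snippet still runs. The problematic pattern is only exercised when CUDA is actually available.

```python
import torch
import torch.nn as nn

# Fall back to the CPU so the example also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


class Scale(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        # No device move is applied to the Parameter itself.
        self.weight = nn.Parameter(torch.ones(3))

    def forward(self, x):
        return x * self.weight


# Option 1: move the whole module; its parameters stay leaf tensors.
model = Scale().to(device)
print(model.weight.is_leaf)

# Option 2: create (or move) the tensor on the device first, then wrap it.
weight = nn.Parameter(torch.ones(3, device=device))
print(weight.is_leaf)

# Problematic pattern: moving the Parameter after wrapping it.
# The result is the output of a differentiable op, so it is not a leaf.
if torch.cuda.is_available():
    bad = nn.Parameter(torch.ones(3)).cuda()
    print(bad.is_leaf, bad.grad_fn is not None)
```

Checking `is_leaf` on a parameter is a quick way to confirm that it is the tensor gradients will be accumulated into, rather than a copy produced by an operation.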