I want to optimize a single tensor on its own. I define the variable as m = Variable(torch.rand(3,2).cuda().float(), requires_grad=True), then I create the optimizer as opt = optim.Adam(itertools.chain([nn.Parameter(m)]), lr=0.001). My loss function requires retain_graph=True when calling backward(). However, after backward() the gradient of m is None. How can I fix this?
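For reference, here is a minimal, self-contained sketch of the setup, with two assumptions: Variable is replaced by a plain tensor with requires_grad=True (Variable is deprecated in modern PyTorch), and the actual loss (not shown in the post) is stood in for by a placeholder. Note that wrapping m in nn.Parameter creates a new leaf tensor, so gradients accumulate on that new Parameter rather than on m itself, which would leave m.grad as None:

```python
import torch
from torch import optim

# Plain tensor with requires_grad=True (the modern equivalent of
# Variable(..., requires_grad=True)); .cuda() is dropped so the
# snippet runs anywhere.
m = torch.rand(3, 2, requires_grad=True)

# Pass m itself to the optimizer. Wrapping it as nn.Parameter(m)
# would create a *new* leaf tensor: gradients would then land on
# that Parameter, and m.grad would stay None after backward().
opt = optim.Adam([m], lr=0.001)

loss = (m ** 2).sum()  # placeholder loss; the real one is not shown
loss.backward()

print(m.grad is not None)  # gradients now accumulate on m
opt.step()
```

The itertools.chain call is also unnecessary: optim.Adam accepts a plain list of tensors directly.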
Could you give the definition of the loss function?