Using var.detach().numpy() but still want to compute the gradient with autograd

Hi,

I am solving a non-linear optimization problem and use torch.autograd.grad to provide the Jacobian that the optimizer requires.
Please see the code snippet below:

_stateplus = torch.stack([x[0], x[1], _ar1_plus]).numpy()
_controls_plus[z_idx, epsilon_idx, :] = policy_plus[z_idx].evaluate(
   _stateplus)

where x is a tensor with respect to which I want to compute the gradient.
I add the .numpy() call because .evaluate() accepts only NumPy arrays.

But then I encounter the following error, as expected:

RuntimeError: Can't call numpy() on Variable that requires grad. Use var.detach().numpy() instead.

I think the detach().numpy() call breaks the computational graph, but I need the gradient with respect to x.

How can I avoid this situation?

Hi,

Unfortunately, you will only be able to get gradients through autograd if you use PyTorch's Tensors and operators for all the computations.

If you have an operation that you cannot do this way, you will need to provide the gradient formula for that step yourself. You can see how to extend autograd in this Note.
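
For reference, here is a minimal self-contained sketch of that approach: wrap the NumPy-only step in a torch.autograd.Function and supply its gradient in backward() yourself. The functions np_square and np_square_grad below are made-up placeholders standing in for your evaluate() call and its hand-derived Jacobian.

import numpy as np
import torch

def np_square(a):
    # stands in for the external, NumPy-only function (e.g. evaluate())
    return a ** 2

def np_square_grad(a):
    # hand-written derivative of np_square: d(a**2)/da = 2a
    return 2 * a

class NumpyStep(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        # autograd cannot see inside this call, so we drop to NumPy here
        out = np_square(x.detach().numpy())
        return torch.as_tensor(out, dtype=x.dtype)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # the gradient formula is supplied by hand, since autograd did not track it
        local_grad = torch.as_tensor(np_square_grad(x.detach().numpy()), dtype=x.dtype)
        return grad_output * local_grad

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = NumpyStep.apply(x).sum()
grad, = torch.autograd.grad(y, x)
print(grad)  # tensor([2., 4., 6.])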


Hi,

Thank you for the link.

So you mean var.detach().numpy() breaks the computational graph and we can no longer compute the gradient correctly with torch.autograd.grad(). Is that correct?


That is correct. Because you convert the Tensor into a NumPy array, we cannot track gradients anymore.
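
A small illustration (the numbers here are arbitrary):

import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
# the graph is cut at detach(); the multiplication happens outside autograd
y = torch.as_tensor(x.detach().numpy() * 3.0)
print(y.requires_grad)  # False: y is no longer connected to x
# torch.autograd.grad(y.sum(), x) would raise a RuntimeError,
# because y does not require grad and has no grad_fn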

Thank you. It is a bit of sad news, but I will try to come up with another route.

Unfortunately we cannot do magic :smiley: For AD to work, we need to know everything that you do with the data. If we cannot know what your function did, we cannot compute gradients for it.

Yes, you are definitely correct. Thank you again.