Is there a way to make Autograd.grad behave element-wise?

I want to compute the gradient of a network with respect to its inputs. The network takes a 1-feature size-N batch of shape (N,1) and returns an output of the same shape.

If I pass in a (1,1) input inp, I can do it like this:

out = net(inp)
dNetdInp = torch.autograd.grad(out, inp, create_graph=True)[0]

But what if my input is (N,1)? If I try the same thing, I get a RuntimeError: grad can be implicitly created only for scalar outputs. What I want is an (N,1) tensor of gradients, with the ith element corresponding to the gradient of the net's ith output with respect to the ith input. How might I do this elegantly?


The error happens because out has more than one element.
If you have a scalar loss and want the gradient wrt that, you can just compute it and call autograd.grad() on that.
If you want the gradient of the ith output wrt the ith input, then there are two cases:

  • If your net computes the value for each input element independently (no batchnorm or layer norm), then you can simply call autograd.grad() on out.sum(), or on out with grad_outputs=torch.ones_like(out).
  • If not, then you will have to do one backward for each element in your Tensor to make sure you can separate the gradient coming from each output element.
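The two cases above can be sketched roughly as follows. This is only an illustration: the small net here is a hypothetical stand-in (not the network from the question), chosen so that each row of the batch is processed independently, which lets all three variants be compared.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in network (an assumption for illustration):
# it has no batchnorm, so each row of the batch is handled independently.
net = nn.Sequential(nn.Linear(1, 8), nn.Tanh(), nn.Linear(8, 1))

N = 5
inp = torch.randn(N, 1, requires_grad=True)
out = net(inp)  # shape (N, 1)

# Case 1, variant a: reduce to a scalar with sum(), then differentiate.
grad_sum = torch.autograd.grad(out.sum(), inp, create_graph=True)[0]

# Case 1, variant b: keep out as is and pass ones as grad_outputs.
grad_ones = torch.autograd.grad(
    out, inp, grad_outputs=torch.ones_like(out), create_graph=True
)[0]

# Case 2: one backward pass per output element, keeping only the
# entry that belongs to the matching input.
rows = []
for i in range(N):
    g = torch.autograd.grad(out[i, 0], inp, retain_graph=True)[0]  # (N, 1)
    rows.append(g[i])
grad_loop = torch.stack(rows)  # (N, 1)
```

For a net like this one, all three results have shape (N,1) and agree. If the net mixes samples (for example through batchnorm in training mode), only the loop isolates the gradient of the ith output with respect to the ith input.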

Thanks for the reply!

I want to apply a loss function to the gradients: in particular, I want to penalize negative gradients with respect to any input. If I had the (N,1) tensor dNetdInp, I could simply take torch.relu(-dNetdInp) and then sum the result.

Is there a way to adapt your solution for this kind of use case? (Applying relu to the sum wouldn’t give me the right loss.)

Could grad_outputs=torch.ones_like(out) work for this?
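As a rough sketch of how that might look, assuming the net treats each sample independently: passing grad_outputs=torch.ones_like(out) returns a tensor with the same shape as inp, not a single summed number, so the relu can be applied per element before summing. The net below is again a hypothetical stand-in, and penalty is just an illustrative name.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in network (assumption: no cross-sample layers).
net = nn.Sequential(nn.Linear(1, 8), nn.Tanh(), nn.Linear(8, 1))

inp = torch.randn(6, 1, requires_grad=True)
out = net(inp)  # shape (6, 1)

# Per-input gradients, shape (6, 1). create_graph=True keeps the graph
# of the gradient computation so a loss built on it can be backpropagated.
dNetdInp = torch.autograd.grad(
    out, inp, grad_outputs=torch.ones_like(out), create_graph=True
)[0]

# Penalize negative gradients element-wise, then sum to a scalar loss.
penalty = torch.relu(-dNetdInp).sum()

# Backpropagate the penalty into the network parameters.
penalty.backward()
```

Note that create_graph=True is what makes the penalty differentiable with respect to the parameters; without it, dNetdInp would be detached from the graph and penalty.backward() would fail.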