PyTorch autograd.grad: ValueError: only one element tensors can be converted to Python scalars

In the documentation of torch.autograd.grad, the parameters are described as follows:

Parameters:

outputs (sequence of Tensor) – outputs of the differentiated function.

inputs (sequence of Tensor) – Inputs w.r.t. which the gradient will be returned (and not accumulated into .grad).

I try the following:

import torch

a = torch.rand(2, requires_grad=True)
b = torch.rand(2, requires_grad=True)
c = a + b
d = a - b

torch.autograd.grad([c, d], [a, b])  # ValueError: only one element tensors can be converted to Python scalars
torch.autograd.grad(torch.tensor([c, d]), torch.tensor([a, b]))  # RuntimeError: grad can be implicitly created only for scalar outputs

I would like to get the gradients of a list of tensors w.r.t. another list of tensors. What is the correct way to pass the parameters?

Hi,

The thing is that to backpropagate, you need to provide the signal to backpropagate from the outputs.
For functions that return a scalar, it is natural to use 1 as that signal, so that is what we do implicitly.
For functions that return more than one value, there is no natural signal, so you need to provide one yourself with the grad_outputs argument.
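For example, a minimal sketch along these lines should work (the all-ones tensors passed as grad_outputs are just one possible choice of signal, matching what .backward() uses for a scalar loss):

import torch

a = torch.rand(2, requires_grad=True)
b = torch.rand(2, requires_grad=True)
c = a + b
d = a - b

# One grad_outputs tensor per output, each with the same shape as that output.
grad_a, grad_b = torch.autograd.grad(
    outputs=[c, d],
    inputs=[a, b],
    grad_outputs=[torch.ones_like(c), torch.ones_like(d)],
)
print(grad_a)  # contributions from c and d add up -> tensor([2., 2.])
print(grad_b)  # +1 from c and -1 from d cancel -> tensor([0., 0.])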
