Why do we need to pass the gradient parameter to the backward function in PyTorch?

When you do something like y.backward(torch.ones_like(y)), the accumulated .grad does come out the same as if you had called .backward() once per element of y, but that's not what autograd actually does under the hood. Autograd computes a vector-Jacobian product vᵀJ in a single backward pass, and the gradient argument supplies the vector v. Passing all ones is equivalent to (y.sum()).backward(), which is also why a non-scalar y needs the argument in the first place: there's no single scalar to implicitly differentiate.
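A minimal sketch showing both views, assuming a toy y = x ** 2 just for illustration: the single vector-Jacobian pass and the element-by-element loop land on the same .grad, but the loop costs one backward pass per element.

```python
import torch

# Non-scalar outputs need an explicit `gradient` argument: autograd
# computes vector-Jacobian products v^T J, and v is that argument.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2  # y is a vector, so a bare y.backward() raises a RuntimeError

y.backward(torch.ones_like(y))  # v = ones, i.e. same as (y.sum()).backward()
print(x.grad)  # tensor([2., 4., 6.])

# Accumulating per-element backward calls gives the same .grad values,
# but runs one backward pass per element instead of one in total.
x2 = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y2 = x2 ** 2
for i in range(y2.numel()):
    y2[i].backward(retain_graph=True)  # each element is a scalar, so no argument needed
print(x2.grad)  # tensor([2., 4., 6.]) -- same result, more work
```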
