Not able to understand autograd

y.grad gets calculated but once it has been used it is discarded to save memory. How else would backprop work?

Others have asked how to get the gradient with respect to y. How to compute the gradients of non leaf variables in PyTorch

1 Like