Confusion about loss.backward() and how it updates gradients

Maybe a starting point is https://pytorch.org/docs/stable/notes/autograd.html

Behind the scenes PyTorch tracks all operations on tensors with requires_grad == True and builds a computation graph during the forward pass. It knows how the loss value was calculated and can automatically back-propagate the gradients step by step from the loss (or any scalar model output) to the model parameters.
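
A minimal sketch (using a small nn.Linear layer just for illustration) showing that calling loss.backward() fills the parameters' .grad attributes:

```python
import torch

# Tiny model: its parameters have requires_grad == True by default
model = torch.nn.Linear(3, 1)
x = torch.randn(4, 3)            # input batch (no gradient tracking needed)
target = torch.randn(4, 1)

# Forward pass: each operation is recorded in the autograd graph
output = model(x)
loss = torch.nn.functional.mse_loss(output, target)

print(model.weight.grad)         # None - backward() hasn't been called yet

# Backward pass: gradients of the loss w.r.t. each parameter are
# accumulated into the parameters' .grad attributes
loss.backward()

print(model.weight.grad.shape)   # torch.Size([1, 3])
print(model.bias.grad)           # tensor holding the bias gradient
```

Note that backward() only computes and accumulates the gradients; the actual parameter update happens when you call something like optimizer.step() afterwards.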
