Loss divided by the item() of itself

Hi there, I am wondering: in PyTorch, if I do

loss = loss/loss.item(), I know that the resulting loss will have a scalar value of 1. Are the gradients of the loss still preserved? And what is the difference between loss/loss and loss/loss.item()?

`loss_norm = loss/loss.item()` will have a gradient in the same direction as `loss`, but scaled by `1/loss.item()`, because the gradient flows through `loss` but not through `loss.item()` (which is just a constant number as far as autograd is concerned). With `loss/loss`, on the other hand, the gradient flows through both the numerator and the denominator, and the two contributions cancel, so the result has zero gradient.
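A minimal self-contained check, using a toy parameter `w` and a made-up squared loss purely for illustration, that shows both behaviours:

```python
import torch

w = torch.tensor([2.0, 3.0], requires_grad=True)

# Case 1: divide by the detached Python float from .item().
loss = (w ** 2).sum()               # 13.0
(loss / loss.item()).backward()     # value is 1.0, gradient still flows through the numerator
print(w.grad)                       # roughly [0.3077, 0.4615] == [4., 6.] / 13

w.grad = None

# Case 2: divide by the loss tensor itself.
loss = (w ** 2).sum()
(loss / loss).backward()            # gradient flows through numerator and denominator
print(w.grad)                       # [0., 0.] -- the two contributions cancel
```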
