How does PyTorch handle negative loss functions?

Say for example I had a

loss = -nn.MSELoss()(x1, x2)

Would PyTorch drive the loss between x1 and x2 towards -infinity or would it go towards 0?

Hi Connor!

The standard PyTorch optimizers attempt to minimize the loss, that is,
to make it algebraically smaller (less positive, more negative). So
minimizing -MSELoss means maximizing the MSE between x1 and x2:
the loss will be driven towards -infinity, not towards 0.
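You can see this with a small sketch (the tensors and hyperparameters here are made up for illustration): a trainable x1 optimized against a negated MSELoss moves away from x2, and the MSE grows without bound.

```python
import torch

# Hypothetical setup: x1 is the trainable tensor, x2 a fixed target.
x1 = torch.zeros(3, requires_grad=True)
x2 = torch.ones(3)
opt = torch.optim.SGD([x1], lr=0.1)
loss_fn = torch.nn.MSELoss()

for _ in range(100):
    opt.zero_grad()
    loss = -loss_fn(x1, x2)  # negated loss, as in the question
    loss.backward()
    opt.step()

# Minimizing -MSE maximizes the MSE: x1 is pushed away from x2,
# so the (negated) loss heads towards -infinity, not 0.
print(loss_fn(x1, x2).item())  # far larger than the initial MSE of 1.0
```

Run it for more steps and the MSE keeps growing; there is no minimum for the optimizer to settle into.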

As an aside, because they use variations on gradient descent, they
don’t care about the overall level of the loss – where zero is – only
the direction of its gradient. So lossA and lossB = lossA - 1,000
get optimized in exactly the same way.
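A quick way to convince yourself of this (the tensors here are arbitrary examples): subtracting a constant from a loss leaves its gradient unchanged.

```python
import torch

x = torch.randn(4, requires_grad=True)
target = torch.randn(4)

loss_a = torch.nn.functional.mse_loss(x, target)
loss_b = loss_a - 1_000  # same loss, shifted by a constant

grad_a = torch.autograd.grad(loss_a, x, retain_graph=True)[0]
grad_b = torch.autograd.grad(loss_b, x)[0]

# The constant offset contributes zero gradient, so both losses
# produce identical parameter updates under gradient descent.
print(torch.equal(grad_a, grad_b))  # True
```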


K. Frank