Loss reduction sum vs mean: when to use each?

chaslie · September 7, 2021, 8:54am

Hi Ptrblck,

I have a similar issue: when I use MSELoss(reduction='sum') and MSELoss(reduction='mean'), the two calculated losses do not differ by a factor equal to the batch size. For example, after the first iteration:

criterion = torch.nn.MSELoss(reduction='sum')
criterion(out,real_data_I)
tensor(40190.8242, device='cuda:0', grad_fn=<MseLossBackward>)

criterion = torch.nn.MSELoss()
criterion(out,real_data_I)
tensor(0.2726, device='cuda:0', grad_fn=<MseLossBackward>)

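The batch size is 9, so I would expect the two losses to differ by a factor of 9: either the mean loss should be 4,465.647 (40190.8242 / 9) or the summed loss should be 2.4534 (0.2726 × 9), depending on which value is correct.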

I am more suspicious of the 0.2726 value, as it looks too low for a model at the start of training, but I don't understand where the discrepancy comes from.
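The arithmetic can be checked in plain Python using only the numbers printed above. The ratio of the two reported losses comes out at roughly 147,000 rather than 9. That would be consistent with reduction='mean' dividing by the total number of elements in the tensor rather than by the batch size. This is an inference from the printed values, since the tensor shapes are not shown here. The variable names are only illustrative.

```python
# Values copied from the output above.
loss_sum = 40190.8242
loss_mean = 0.2726
batch_size = 9

# The two values expected if 'mean' divided by the batch size only.
expected_mean = loss_sum / batch_size
expected_sum = loss_mean * batch_size
print(expected_mean, expected_sum)

# What the two reported values actually imply: the divisor used by 'mean'.
implied_count = loss_sum / loss_mean
print(implied_count)

# If that divisor is the total element count, this is the number of
# values per sample (an inference; the shapes are not given in the post).
implied_per_sample = implied_count / batch_size
print(implied_per_sample)
```

If that reading is right, both printed values can be correct at once and simply use different normalisations.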

chaslie
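
For comparison, here is a minimal sketch of how the two reductions relate on a small CPU tensor. The shapes (9 samples of 4 values each) and the names batch_size, features, target and loss_per_sample are made up for illustration and are not taken from the post. It relies on the documented behaviour of torch.nn.MSELoss, where 'mean' averages over all elements and 'sum' adds them up.

```python
import torch

torch.manual_seed(0)

# Hypothetical shapes: a batch of 9 samples with 4 values each.
batch_size, features = 9, 4
out = torch.randn(batch_size, features)
target = torch.randn(batch_size, features)

loss_sum = torch.nn.MSELoss(reduction='sum')(out, target)
loss_mean = torch.nn.MSELoss(reduction='mean')(out, target)  # 'mean' is the default

# 'mean' divides the summed squared error by the number of elements
# (9 * 4 = 36 here), not by the batch size.
matches_numel = torch.allclose(loss_mean, loss_sum / out.numel())
print(matches_numel)

# Dividing the summed loss by the batch size gives a per-sample loss,
# which is larger than the 'mean' loss by a factor of `features`.
loss_per_sample = loss_sum / batch_size
matches_per_sample = torch.allclose(loss_per_sample, loss_mean * features)
print(matches_per_sample)
```

If a per-sample loss is what is wanted, one option is to use reduction='sum' and divide by the batch size explicitly, as in the last step of the sketch.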