About average loss

Hi, can anybody tell me when we should use the average loss for training? I found that in my case, when I plotted loss = mse(output, label) as the learning curve, it would sometimes increase. But if I used the average loss instead, the learning curve looks good. I calculate the average loss as follows:
for epoch in range(num_epochs):
    running_loss = 0.0  # reset the accumulators each epoch
    num_images = 0
    for i, batched_images in enumerate(dataloader):
        input, label = batched_images
        output = model(input)
        loss = mse(output, label)
        running_loss += loss.item()
        num_images += input.size(0)
    avg_loss = running_loss / num_images

Any information would be appreciated :)

The per-batch loss can be noisy, since some batches may yield a higher loss than the previous one (e.g. if a batch contains "hard" samples). However, since training generally decreases the overall loss, averaging over the epoch smooths the curve.
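As a small illustration (plain Python, with made-up batch-loss numbers standing in for real training values), plotting every batch loss gives a jagged curve, while one averaged point per epoch comes out smooth:

```python
# Hypothetical per-batch losses for three epochs: trending down overall,
# but with noisy jumps between batches (e.g. "hard" batches).
batch_losses = [
    [1.00, 1.40, 0.90, 1.20],  # epoch 0
    [0.80, 1.10, 0.70, 0.95],  # epoch 1
    [0.60, 0.85, 0.55, 0.70],  # epoch 2
]

# Raw curve: every batch loss in order -> sometimes increases.
raw_curve = [loss for epoch in batch_losses for loss in epoch]

# Averaged curve: one point per epoch -> decreases steadily here.
avg_curve = [sum(epoch) / len(epoch) for epoch in batch_losses]

print(raw_curve)
print(avg_curve)
```

The raw curve jumps up between some consecutive batches even though each epoch average is lower than the last, which matches the behavior described above.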

Thank you for your reply. That makes sense.