I am using the MSE loss described here:

https://pytorch.org/docs/stable/nn.html?highlight=mse#torch.nn.functional.mse_loss

`torch.nn.functional.mse_loss(input, target, size_average=None, reduce=None, reduction='mean') → Tensor`

My input and target are of size [16, 2, 48, 120] i.e. a batch size of 16 where each item is a tensor of size [2, 48, 120].

Supplying the argument `reduction="sum"`

returns `tensor(3304.8472, grad_fn=<MseLossBackward>)`

whereas the argument `reduction="mean"`

returns `tensor(0.0747, grad_fn=<MseLossBackward>)`
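The ratio of the two values suggests what is happening. A quick sanity check (with random tensors of the same shape as mine, not my actual data) confirms that `"mean"` divides the summed loss by the total element count, not by the batch size:

```python
import torch
import torch.nn.functional as F

# Random stand-ins with the same shape as my tensors.
torch.manual_seed(0)
inp = torch.rand(16, 2, 48, 120)
tgt = torch.rand(16, 2, 48, 120)

loss_sum = F.mse_loss(inp, tgt, reduction="sum")
loss_mean = F.mse_loss(inp, tgt, reduction="mean")

# The ratio comes out to 16 * 2 * 48 * 120 = 184320, the total
# number of elements, not the batch size 16.
ratio = (loss_sum / loss_mean).item()
print(ratio)
```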

This probably means that the method is averaging over the total number of pixels instead of averaging only over the batch size. What can I do to sum over every pixel and then divide by the batch size?
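One workaround I am considering is to use `reduction="sum"` and divide by the batch size myself; a minimal sketch of what I mean (`batch_mse` is my own hypothetical helper, not a PyTorch function):

```python
import torch
import torch.nn.functional as F

def batch_mse(input, target):
    # Sum the squared error over every element, then divide by the
    # batch size (dim 0) only, instead of the total element count.
    return F.mse_loss(input, target, reduction="sum") / input.size(0)

inp = torch.rand(16, 2, 48, 120, requires_grad=True)
tgt = torch.rand(16, 2, 48, 120)
loss = batch_mse(inp, tgt)
loss.backward()  # gradients still flow as usual
```

Is this the right approach, or is there a built-in reduction mode for this?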