Interpreting loss value

Thanks a lot for your help. I was looking at the Transfer Learning Tutorial and I found that the normalization is done with:

transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])

It seems that resnet18 was trained using this normalization, so I am using that one and the results seem to improve somewhat.

When it comes to the loss part, he calculates it like this:

# statistics (inside the batch loop)
running_loss += loss.item() * inputs.size(0)
running_corrects += torch.sum(preds == labels.data)

# after the loop, once per epoch
epoch_loss = running_loss / dataset_sizes[phase]
epoch_acc = running_corrects.double() / dataset_sizes[phase]

With phase being either ‘train’ or ‘val’, and dataset_sizes[phase] being the total number of examples used in training or validation. So yes, he does divide by the total number of samples, but before accumulating the loss value he multiplies it by the number of examples in that batch.

Any insight on why this is done this way?
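To try to make this concrete for myself, I sketched the bookkeeping in plain Python (no torch, just toy per-sample loss numbers I made up), assuming loss.item() returns the batch mean, since CrossEntropyLoss defaults to reduction='mean':

```python
# Per-sample losses for a toy dataset of 5 examples, split into
# batches of size 2, 2, 1 (note the last batch is smaller).
batches = [[1.0, 3.0], [2.0, 4.0], [10.0]]

all_samples = [x for b in batches for x in b]
true_mean = sum(all_samples) / len(all_samples)  # 20 / 5 = 4.0

# What the tutorial does: loss.item() is the batch *mean*, so
# multiplying by the batch size recovers the batch *sum*.
running_loss = 0.0
n = 0
for b in batches:
    batch_mean = sum(b) / len(b)         # stands in for loss.item()
    running_loss += batch_mean * len(b)  # stands in for inputs.size(0)
    n += len(b)
epoch_loss = running_loss / n            # 20 / 5 = 4.0, matches true_mean

# Naively averaging the batch means instead would over-weight
# the smaller last batch: (2.0 + 3.0 + 10.0) / 3 = 5.0
naive = sum(sum(b) / len(b) for b in batches) / len(batches)

print(epoch_loss, true_mean, naive)
```

So if I understand it right, the multiply-then-divide gives the true per-sample mean loss over the epoch even when the last batch is a different size. Is that the whole reason?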

If I instead divide the loss value by the total number of examples, as you suggest, I get really small loss values, like 0.003 and 0.002.
Help.