I had a look at this tutorial in the PyTorch docs for understanding transfer learning. There was one line that I failed to understand. After the loss is calculated using `loss = criterion(outputs, labels)`, the running loss is calculated using `running_loss += loss.item() * inputs.size(0)`, and finally the epoch loss is calculated using `running_loss / dataset_sizes[phase]`.
Isn't `loss.item()` supposed to be the loss for an entire mini-batch (please correct me if I am wrong)? I.e., if the `batch_size` is 4, `loss.item()` would give the loss for the entire set of 4 images. If this is true, why is `loss.item()` being multiplied by `inputs.size(0)` while calculating `running_loss`? Isn't this step an extra multiplication in this case?
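To make the step concrete, here is a self-contained sketch of the loop fragment I'm referring to. The model outputs, labels, and `dataset_sizes` here are dummy stand-ins I made up, not the tutorial's actual data; `criterion` is `nn.CrossEntropyLoss()` as in the tutorial:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()      # the tutorial's loss function
dataset_sizes = {"train": 8}           # pretend the train set has 8 images
batch_size = 4

running_loss = 0.0
for _ in range(dataset_sizes["train"] // batch_size):
    inputs = torch.randn(batch_size, 3, 224, 224)  # dummy image batch
    outputs = torch.randn(batch_size, 2)           # dummy model outputs (2 classes)
    labels = torch.randint(0, 2, (batch_size,))    # dummy labels
    loss = criterion(outputs, labels)
    # The line I'm asking about: loss.item() multiplied by the batch size
    running_loss += loss.item() * inputs.size(0)

epoch_loss = running_loss / dataset_sizes["train"]
print(epoch_loss)
```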
Any help would be appreciated. Thanks!