Calculating training loss - sum of avg batch losses or avg loss across all samples

Hi,

how is the loss calculated during training,

is it a batchwise average or an average over all samples?

Typically the loss is calculated for each batch separately.

In other words, inside your training loop that iterates over batches, you have these commands:

optimizer.zero_grad()
loss = calculate_loss(*args)
loss.backward()
optimizer.step()

So loss is computed and overwritten once per batch, which matches the cadence at which the optimizer steps in the gradient direction.
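One consequence worth knowing: if you log a running average of the per-batch loss values, that "average of batch averages" only equals the true average over all samples when every batch has the same size. Here is a minimal sketch with made-up loss values and deliberately unequal batches to show the difference (the numbers are purely illustrative):

```python
# Hypothetical per-sample losses grouped into unequal batches (values assumed):
batch_losses = [[1.0, 2.0, 3.0, 4.0],  # batch of 4 samples
                [10.0, 20.0]]          # batch of 2 samples

# Averaging the per-batch means -- what a naive running average of
# loss.item() gives you when the loss uses mean reduction:
avg_of_batch_means = sum(sum(b) / len(b) for b in batch_losses) / len(batch_losses)

# True average over all samples: weight each batch by its size.
overall_mean = sum(sum(b) for b in batch_losses) / sum(len(b) for b in batch_losses)

print(avg_of_batch_means)  # 8.75
print(overall_mean)        # 6.666...
```

With equal batch sizes the two quantities coincide, so in practice the discrepancy usually only shows up because of a smaller final batch at the end of an epoch.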