Label smoothing results in very large loss values, is this normal?

catqaq · June 13, 2022, 10:12am

Without label smoothing, the loss is small, e.g. 0.3418:
loss = F.cross_entropy(preds, labels, label_smoothing=0)

With label smoothing, the loss becomes very large (2604166656.0000), but training otherwise proceeds normally:
loss = F.cross_entropy(preds, labels, label_smoothing=0.1)
So, is there anything wrong?

ptrblck · June 14, 2022, 5:47am

The value does look quite large, but it could be expected. Could you calculate the loss manually, applying label smoothing to the plain cross-entropy formula, and compare it to these results?
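
A manual check along those lines could look like the sketch below. It is written in plain Python so it runs without any dependencies, and it assumes the usual smoothing definition: the target puts 1 - eps on the true class plus eps / K spread uniformly over all K classes, averaged over the batch. The helper names (`log_softmax`, `smoothed_cross_entropy`) and the example logits are made up for illustration and are not taken from the original post.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax: subtract the max before exponentiating.
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def smoothed_cross_entropy(batch_logits, labels, eps=0.0):
    # Hypothetical helper: mean cross entropy against the smoothed target
    # q = (1 - eps) * one_hot(label) + eps / K.
    total = 0.0
    for logits, y in zip(batch_logits, labels):
        logp = log_softmax(logits)
        k = len(logits)
        nll = -logp[y]             # plain cross-entropy term
        uniform = -sum(logp) / k   # cross entropy against the uniform distribution
        total += (1.0 - eps) * nll + eps * uniform
    return total / len(labels)

# Made-up logits of moderate size: smoothing raises the loss only slightly.
preds = [[2.0, 0.5, -1.0], [0.1, 3.0, 0.2]]
labels = [0, 1]
plain = smoothed_cross_entropy(preds, labels, eps=0.0)
smooth = smoothed_cross_entropy(preds, labels, eps=0.1)

# Made-up logits of huge magnitude: the prediction is correct, so the plain
# loss is zero, but the uniform term scales with the logit gap.
big = [[1e10, 0.0, 0.0]]
big_plain = smoothed_cross_entropy(big, [0], eps=0.0)
big_smooth = smoothed_cross_entropy(big, [0], eps=0.1)

print(plain, smooth)
print(big_plain, big_smooth)
```

If the manual value matches what F.cross_entropy(preds, labels, label_smoothing=0.1) returns for the same inputs, the large loss most likely comes from the magnitude of the logits rather than from a bug in the loss. With smoothing, every class contributes eps / K times its negative log-probability, so very confident predictions are penalized in proportion to the gap between logits.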