Suppose I have a training set which consists of 4 classes and the number of samples belonging to the 4 classes is 20, 30, 40, 10 respectively. So should I pass the tensor torch.tensor([20,30,40,10]) / 100. to the weight argument of the loss function?

Or should I calculate the values of the weight argument for each batch on the fly in the training loop?

Hi Tejan!

You have this backwards – you want to weight the less-frequent classes more heavily in your loss function. The most common weighting scheme would be the reciprocal of what you have:

`100.0 / torch.tensor([20.0, 30.0, 40.0, 10.0])`

My preference is to calculate the weights using the frequency of classes in the entire training set and use this single set of weights for each batch.
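A minimal sketch of this approach, assuming the class counts of 20, 30, 40, 10 from the question (the variable names and the toy batch are illustrative):

```python
import torch
import torch.nn as nn

# class counts taken over the entire training set (from the question)
counts = torch.tensor([20.0, 30.0, 40.0, 10.0])

# inverse-frequency weights: rarer classes get larger weights
weights = counts.sum() / counts  # tensor([5.0000, 3.3333, 2.5000, 10.0000])

# build the criterion once with this single set of weights
# and reuse it for every batch in the training loop
criterion = nn.CrossEntropyLoss(weight=weights)

# toy batch: 8 samples, 4 classes
logits = torch.randn(8, 4)
targets = torch.randint(0, 4, (8,))
loss = criterion(logits, targets)
```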

Best.

K. Frank

Hi,

But shouldn’t the sum of the weight vector equal 1?

Hi Tejan!

No. `CrossEntropyLoss` computes a *weighted mean* (when using the default `reduction = 'mean'`). This means that `CrossEntropyLoss` divides by the sum of the weights, so the overall scale of the weights drops out of the final loss value.

(It doesn’t hurt to have the weights sum to one; it just doesn’t matter.)
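A quick check of this, assuming the inverse-frequency weights from earlier in the thread (the seed and toy batch are illustrative):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(8, 4)
targets = torch.randint(0, 4, (8,))

w = torch.tensor([5.0, 10.0 / 3.0, 2.5, 10.0])

# same weights, once raw and once normalized to sum to 1
loss_raw = nn.CrossEntropyLoss(weight=w)(logits, targets)
loss_normed = nn.CrossEntropyLoss(weight=w / w.sum())(logits, targets)

# with reduction='mean' the weighted sum is divided by the sum of
# the per-sample weights, so rescaling all weights by a constant
# leaves the loss unchanged
assert torch.allclose(loss_raw, loss_normed)
```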

Best.

K. Frank