I'm trying to use class weights with the CrossEntropy loss function, and there is no clear-cut way to compute them in PyTorch. Currently I'm using method 1, but it penalizes the loss by a greater amount than the second method: the class weights from method 1 are much higher for the minority classes, which means the parameters get more aggressive updates and, I assume, potentially distorts the pretrained weights. Which of the two is the standard approach in PyTorch?
Method 1: total / (samples_per_class × num_classes)
Method 2: 1 / samples_per_class
import torch

# class_counts[i] = number of training samples with label i
class_counts = train_df['level'].value_counts().sort_index().tolist()
total = len(train_df)

# Method 1: total / (num_classes * samples_per_class)
weights = [total / (len(class_counts) * c) for c in class_counts]
weights = torch.tensor(weights, dtype=torch.float32)
weights = weights.to(device)
"""
below is the current training distribution for prod, which we use to generate class weights
0 18067
2 3704
1 1710
3 611
4 496
"""
print(weights)
and I get: tensor([0.2722, 2.8754, 1.3271, 8.0607, 9.9333], device='cuda:0')
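For reference, here is a minimal standalone sketch of what I mean by the two methods, assuming the class counts from the distribution above (listed in label order 0-4) and that the resulting tensor is passed to nn.CrossEntropyLoss via its weight argument; the rest of the training setup is omitted:

import torch
import torch.nn as nn

# Counts per label 0..4, taken from the distribution above
class_counts = [18067, 1710, 3704, 611, 496]
total = sum(class_counts)
num_classes = len(class_counts)

# Method 1: total / (samples_per_class * num_classes)
w1 = torch.tensor([total / (num_classes * c) for c in class_counts], dtype=torch.float32)

# Method 2: 1 / samples_per_class
w2 = torch.tensor([1.0 / c for c in class_counts], dtype=torch.float32)

print(w1)  # should be close to the tensor printed above
print(w2)  # much smaller absolute values, same ratios between classes

# Either tensor can be passed to the loss through the weight argument
criterion = nn.CrossEntropyLoss(weight=w1)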