Compute class weight

To handle imbalanced data, I would like to weight each class according to its frequency in the dataset.

With TensorFlow/Keras and scikit-learn this is very straightforward, as in the following:

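import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# datagenerator_train is a Keras ImageDataGenerator; train_dir, input_shape
# and batch_size are defined elsewhere.
generator_train = datagenerator_train.flow_from_directory(directory=train_dir,
                                                          target_size=input_shape,
                                                          batch_size=batch_size,
                                                          shuffle=True)
# Integer class index of every training image
cls_train = generator_train.classes
class_weight = compute_class_weight(class_weight='balanced',
                                    classes=np.unique(cls_train),
                                    y=cls_train)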

Does PyTorch have similar native support? It seems I have to write a function to compute the weights myself.


If you don’t know the targets beforehand, you would need to iterate over all samples once and count the occurrences of each class. Once this is calculated, you could use sklearn.utils.class_weight.compute_class_weight or just these few lines of code:

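import numpy as np
import torch

# target: 1-D NumPy array of integer class labels 0..C-1
class_sample_count = np.unique(target, return_counts=True)[1]  # samples per class
weight = 1. / class_sample_count  # inverse-frequency weight per class
samples_weight = weight[target]  # one weight per sample
samples_weight = torch.from_numpy(samples_weight)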
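
To show where weights like these can plug in, here is a minimal sketch using a toy dataset. The data, the batch size, and the choice between the two options are assumptions for illustration, not part of the answer above. It also assumes the labels are the contiguous integers 0..C-1 with every class present, since `weight[target]` indexes the weights by label.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Hypothetical imbalanced toy data: 90 samples of class 0, 10 of class 1.
target = np.array([0] * 90 + [1] * 10)
features = torch.randn(len(target), 4)

# Inverse-frequency weight per class, then one weight per sample.
class_sample_count = np.unique(target, return_counts=True)[1]
weight = 1.0 / class_sample_count
samples_weight = torch.from_numpy(weight[target]).double()

# Option A: draw samples with probability proportional to their weight,
# so rare classes appear more often in each batch.
sampler = WeightedRandomSampler(samples_weight,
                                num_samples=len(samples_weight),
                                replacement=True)
dataset = TensorDataset(features, torch.from_numpy(target).long())
loader = DataLoader(dataset, batch_size=20, sampler=sampler)

# Option B: keep the sampling as-is and weight the loss per class instead.
class_weight = torch.tensor(weight, dtype=torch.float32)
criterion = torch.nn.CrossEntropyLoss(weight=class_weight)
```

Usually one would pick either the sampler or the weighted loss; combining both corrects for the imbalance twice.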

I was trying to understand how to account for class imbalance while performing semantic segmentation.

So when you say

"If you don’t know the targets beforehand, you would need to iterate all samples once and count all class occurrences. Once this is calculated, you could use the "

do you mean the number of pixels across all label maps that belong to each class, or the number of times each unique label occurs in the dataset?

Thanks
Nishanth

Hi! I have the same question. Did you find any answer to this?


ptrblck’s answer is clear and concise, I think.
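
For the segmentation question, one possible reading is to apply the same counting at the pixel level: every pixel in every label map counts as one occurrence of its class. The sketch below follows that reading. The helper name `pixel_class_weights`, the toy masks, and the inverse-frequency weighting are assumptions for illustration, and labels are assumed to lie in 0..num_classes-1.

```python
import torch

def pixel_class_weights(masks, num_classes):
    """Count pixels per class over all label maps and return
    (counts, inverse-frequency weights). Hypothetical helper."""
    counts = torch.zeros(num_classes, dtype=torch.long)
    for mask in masks:
        # bincount needs a 1-D tensor of non-negative integers
        counts += torch.bincount(mask.flatten(), minlength=num_classes)
    freq = counts.double() / counts.sum()
    weights = torch.zeros(num_classes, dtype=torch.double)
    present = counts > 0  # leave the weight at 0 for classes that never occur
    weights[present] = 1.0 / freq[present]
    return counts, weights.float()

# Two toy 2x3 label maps with classes 0, 1 and 2.
masks = [torch.tensor([[0, 0, 0], [0, 1, 1]]),
         torch.tensor([[0, 0, 2], [0, 0, 0]])]
counts, weights = pixel_class_weights(masks, num_classes=3)

# The per-class weights can be passed to the loss; for segmentation the
# logits have shape (N, C, H, W) and the targets shape (N, H, W).
criterion = torch.nn.CrossEntropyLoss(weight=weights)
```

Counting images per label instead would treat a class that covers a few pixels in many images as common, which is why pixel counts are often the more natural choice for a per-pixel loss.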