Compute class weight

nimning · October 30, 2018, 9:43pm

To handle unbalanced data, I would like to weight each class according to its share of the data.

This is very straightforward in TensorFlow, as follows:

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# datagenerator_train, train_dir, input_shape and batch_size are defined elsewhere
generator_train = datagenerator_train.flow_from_directory(directory=train_dir,
                                                          target_size=input_shape,
                                                          batch_size=batch_size,
                                                          shuffle=True)
# Integer class index of every training sample
cls_train = generator_train.classes
class_weight = compute_class_weight(class_weight='balanced',
                                    classes=np.unique(cls_train),
                                    y=cls_train)

Does PyTorch have similar native support? It seems I have to write my own function to compute the weights.

ptrblck · October 31, 2018, 12:11am

If you don’t know the targets beforehand, you would need to iterate all samples once and count all class occurrences. Once this is calculated, you could use the sklearn.utils.class_weight.compute_class_weight function or just these few lines of code, which compute inverse-frequency class weights and the matching per-sample weights:

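import numpy as np
import torch

# target: 1-D integer NumPy array of class indices 0..C-1, every class present
class_sample_count = np.unique(target, return_counts=True)[1]
weight = 1. / class_sample_count  # inverse-frequency weight per class
samples_weight = weight[target]   # look up the weight of each sample's class
samples_weight = torch.from_numpy(samples_weight)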
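
For reference, here is a minimal sketch of the same counting step written with PyTorch only. It is an illustration rather than an official recipe. The toy `target` tensor is made up, and the formula `n_samples / (n_classes * count)` is the heuristic that scikit-learn's 'balanced' mode uses, which differs from the plain `1. / class_sample_count` above only by a constant factor. It assumes the labels are integer class indices starting at 0.

```python
import torch
import torch.nn as nn
from torch.utils.data import WeightedRandomSampler

# Hypothetical toy labels: 6 samples of class 0, 3 of class 1, 1 of class 2.
target = torch.tensor([0, 0, 0, 0, 0, 0, 1, 1, 1, 2])

# Count how often each class index occurs.
class_count = torch.bincount(target)
num_classes = class_count.numel()

# "Balanced" heuristic: n_samples / (n_classes * count_per_class).
class_weight = target.numel() / (num_classes * class_count.float())

# Option 1: pass the per-class weights to the loss function.
criterion = nn.CrossEntropyLoss(weight=class_weight)

# Option 2: expand to per-sample weights and oversample rare classes.
sample_weight = class_weight[target]
sampler = WeightedRandomSampler(sample_weight,
                                num_samples=len(sample_weight),
                                replacement=True)
```

The per-class weights would typically go to the loss, while the per-sample weights would go to a sampler that is handed to the DataLoader; using both at once may over-correct the imbalance.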

Nishanth_Sasankan · July 1, 2019, 1:17am

I was trying to understand how to account for class imbalance while performing semantic segmentation.

So when you say

"If you don’t know the targets beforehand, you would need to iterate all samples once and count all class occurrences. Once this is calculated, you could use the "

do you mean the number of pixels in the label maps that belong to a given class, or the number of times a given unique label occurs in the dataset?

Thanks
Nishanth

MA_CASANDRA_QUILANG · January 22, 2021, 9:15am

Hi! I have the same question. Did you find an answer to this?

doem97 · July 24, 2021, 5:45pm

ptrblck’s answer is clear and concise I think.