I am working on a model that will be trained on 4-channel medical image data. I noticed that the labels of the images are not distributed evenly. To make the model pay more attention to the under-represented classes, I am planning to augment the training data. But if I apply generic image augmentation, it may have no effect, because the augmentation would affect every class equally. I need to augment the under-represented classes specifically, so that the model sees them more often. I could tune this manually, e.g. augment the rare classes 5 times while the common classes get augmented 2 times, but hand-tuning like that feels risky — it could distort the data the model sees.
Are there any methods or advice that you have or like using? It would be much appreciated.
I don’t see anything wrong with augmenting classes that appear less
frequently more heavily than classes that appear more frequently.
But my guess is that you would be a little bit better off using a WeightedRandomSampler to sample the less-frequent classes more
heavily (and if you do augment your training data, don’t augment any
differently based on class).
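A minimal sketch of the `WeightedRandomSampler` approach, using a made-up imbalanced 3-class label set and a toy 4-channel tensor dataset (the counts, shapes, and names are illustrative, not from your data):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# toy imbalanced labels: 100 / 30 / 10 samples per class (illustrative)
labels = torch.tensor([0] * 100 + [1] * 30 + [2] * 10)
data = torch.randn(len(labels), 4, 8, 8)  # stand-in for 4-channel images

class_counts = torch.bincount(labels).float()   # tensor([100., 30., 10.])
class_weights = 1.0 / class_counts              # rarer class -> larger weight
sample_weights = class_weights[labels]          # one weight per sample

sampler = WeightedRandomSampler(
    weights=sample_weights,
    num_samples=len(labels),  # draw one epoch's worth of samples
    replacement=True,         # minority samples get drawn repeatedly
)

loader = DataLoader(TensorDataset(data, labels), batch_size=16, sampler=sampler)

# With inverse-frequency weights the sampler draws the classes roughly
# uniformly, so the rare class (label 2) is seen about as often as the
# others -- without any class-specific augmentation.
```

Because `replacement=True`, the rare class will appear as duplicates across batches within an epoch; any (class-agnostic) augmentation applied in the dataset's `__getitem__` keeps those duplicates from being pixel-identical.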
(You could also use the weight constructor argument for CrossEntropyLoss, if this is a multi-class classification problem,
or the conceptually similar pos_weight constructor argument for BCEWithLogitsLoss, if this is a binary classification problem. But
I would prefer the WeightedRandomSampler approach unless it is
likely that any given batch would have duplicate samples in it.)
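For completeness, here is a sketch of the loss-weighting alternative. The class counts are made up, and inverse-frequency weighting is just one common choice for setting `weight` — the constructor accepts any per-class weights you like:

```python
import torch
import torch.nn as nn

# illustrative per-class counts for a 3-class problem
class_counts = torch.tensor([100.0, 30.0, 10.0])

# inverse-frequency weights (normalized so they average to 1)
weight = class_counts.sum() / (len(class_counts) * class_counts)

# multi-class: weight the per-class terms of the cross entropy
ce = nn.CrossEntropyLoss(weight=weight)

logits = torch.randn(8, 3)
targets = torch.randint(0, 3, (8,))
loss = ce(logits, targets)

# binary: pos_weight scales the positive-class term of the loss;
# with 10 positives and 100 negatives, pos_weight = 100 / 10 = 10
bce = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([10.0]))
bin_logits = torch.randn(8, 1)
bin_targets = torch.randint(0, 2, (8, 1)).float()
bin_loss = bce(bin_logits, bin_targets)
```

Unlike the sampler, this keeps each batch's class mix unchanged and instead scales the gradient contribution of the rare classes, so no sample is ever duplicated within a batch.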