WeightedRandomSampler for object detection?

Aquafina · May 13, 2025, 2:59am

Hello, as of late I’ve been working with the VinDr mammogram dataset to try to build an object detection + classification model to detect and classify lesions by BIRADS score. A single mammogram image may or may not have a lesion; for those that do, each lesion will have a corresponding bounding box and BIRADS evaluation given in a csv file. Some mammograms may have more than one lesion.

The dataset is pretty imbalanced, with the majority of cases being BIRADS-1 (no findings). As such, I was thinking about using a WeightedRandomSampler. Normally, each instance’s weight is set to 1 / (frequency of its label). However, in this dataset, one image can have multiple labels, since an image can contain multiple lesions, each with its own BIRADS score. So how should I go about assigning weights? Or are there better ways to address class imbalance?

KFrank · May 15, 2025, 6:06pm

Hi Aquafina!

When detecting and classifying a lesion in one part of an image, is your model influenced
by having another lesion somewhere else in the same image? (Some detection
architectures would not be significantly influenced.) If it is not, do you have a
reasonable number of single-lesion images in your training set that cover all of your
classes?

If so, your problem becomes easier. Look at the distribution of classes across your
training set, counting each lesion separately (regardless of whether it’s in a single- or
multiple-lesion image). Then use WeightedRandomSampler to more heavily weight
single-lesion images that belong to the under-represented classes.
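A minimal sketch of that counting step, assuming the csv annotations have already been grouped into one list of class labels per image. The `image_labels` data below is made up for illustration and is not from VinDr.

```python
from collections import Counter

# Hypothetical annotations: one list of lesion class labels per image.
# An empty list stands for an image with no lesion.
image_labels = [
    ["A"], ["A"], ["A"], ["B"], ["C"],
    ["A", "B"], ["A", "A", "C"], [],
]

# Count every lesion separately, whether its image has one lesion or several.
lesion_counts = Counter(label for labels in image_labels for label in labels)

# Indices of single-lesion images, grouped by class. These are the images
# whose sampling weight can be raised for under-represented classes.
single_by_class = {}
for idx, labels in enumerate(image_labels):
    if len(labels) == 1:
        single_by_class.setdefault(labels[0], []).append(idx)

print(dict(lesion_counts))  # {'A': 6, 'B': 2, 'C': 2}
print(single_by_class)      # {'A': [0, 1, 2], 'B': [3], 'C': [4]}
```

The per-class lesion counts show how imbalanced the classes are, and the grouped indices show whether each class has single-lesion images available to reweight.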

If you don’t have an adequate selection of single-lesion images, or if the way your model
detects and classifies lesions is sensitive to the number of lesions in an image, you
(probably) won’t be able to use WeightedRandomSampler effectively.

(It is hypothetically possible that you could find a set of image reweightings that, when
applied across all of the images in your training set, gives you an overall set of lesions
that are reasonably well balanced. But this will probably not be the case.)

In the case that you can’t effectively reweight your training images, you should use class
weights in your classification loss. (I am assuming that your model has a classification
“head” – with its own classification loss – that is somewhat separate from your detection
“head.” For example, perhaps your detector first detects a lesion that is then passed on
to a classifier that classifies it.)

A multi-class classification loss would typically be CrossEntropyLoss. In this case, you
could use its weight constructor argument to reweight the class imbalance when the
loss is computed. Assuming that my speculation about your model architecture is correct,
this would perform the reweighting on a lesion-by-lesion basis.
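To make the effect of class weights concrete, here is a rough pure-Python sketch of a weighted mean cross entropy. The class counts and predicted probabilities are invented for illustration, and the function takes probabilities rather than the raw logits that CrossEntropyLoss expects. In PyTorch itself the weights would be passed as `torch.nn.CrossEntropyLoss(weight=torch.tensor(class_weights))`.

```python
import math

# Hypothetical per-lesion class counts, in class order A, B, C.
class_counts = [60, 20, 10]

# Inverse-frequency class weights, rescaled so that they average to 1.
raw = [1.0 / n for n in class_counts]
class_weights = [w * len(raw) / sum(raw) for w in raw]


def weighted_cross_entropy(probs, targets, weights):
    """Weighted mean of -log p[target] over a batch of lesions.

    With reduction="mean", CrossEntropyLoss divides by the sum of the
    weights of the targets, which is what the denominator does here.
    """
    num = sum(weights[t] * -math.log(p[t]) for p, t in zip(probs, targets))
    den = sum(weights[t] for t in targets)
    return num / den


# Two made-up lesions: one of class A (index 0), one of class C (index 2).
probs = [[0.7, 0.2, 0.1], [0.2, 0.2, 0.6]]
targets = [0, 2]

print(class_weights)  # approximately [0.3, 0.9, 1.8]
print(weighted_cross_entropy(probs, targets, [1.0, 1.0, 1.0]))
print(weighted_cross_entropy(probs, targets, class_weights))
```

With these weights the rare class-C lesion counts six times as much as the common class-A lesion when the loss is averaged, so the reweighting happens per lesion rather than per image.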

Best.

K. Frank

Aquafina · May 15, 2025, 10:36pm

Hello, thanks for your response. If it may be helpful, I am using torchvision.models.detection.fasterrcnn_resnet50_fpn as my model.

Do you mean I should ignore multi-lesion images when training? I think I have enough single-lesion images to make this happen.

I’m not exactly sure how the loss function is implemented for fasterrcnn_resnet50_fpn. Based on an example I’ve seen in torchvision documentation, they don’t explicitly define a loss function before the training loop. I’m assuming this model handles loss calculations internally (it’s “built-into” its implementation)?

Thanks for your patience. I’m a beginner to machine learning and I appreciate any constructive feedback!

KFrank · May 16, 2025, 10:39pm

Hi Aquafina!

No, I would train on all of the images, including multi-lesion images. (In general, you
don’t want to throw away any annotated ground-truth data – it’s generally hard to
come by.)

Assuming that multi-lesion images form a significant part of your training set (if not,
they won’t really confuse the class weightings, so you can train with them without
really worrying about them), I would look at the class counts across your entire training
set, and then use WeightedRandomSampler to more heavily weight single-lesion
images, as appropriate, so that your weighted class counts are approximately uniform.

Yes, I think you’re right that the model handles its loss calculations internally.

I’ve never used fasterrcnn and am not knowledgeable about its details. However, poking
around a little bit, it appears, roughly speaking, that fasterrcnn detects lesion-A and then
separately detects lesion-B, and so on, rather than first detecting a lesion and then
classifying that lesion. So it doesn’t naturally use something like CrossEntropyLoss
as a classification loss.

Also, it does not appear that torchvision’s pre-packaged fasterrcnn exposes any way to
reweight classes in its loss function(s) (or otherwise in its training).

So using WeightedRandomSampler to reweight classes by sampling them more or less
frequently seems to be the way to go.

Assuming (as I believe is the case) that fasterrcnn detects a lesion-A in the upper left of
a single-lesion image more or less the same way it would detect that lesion in a two-lesion
image that also had a lesion-B in the lower right, then I think the scheme I outlined above
ought to work.

Best.

K. Frank

Aquafina · May 17, 2025, 12:02am

Hi, thanks again for your response.

Could you clarify how I would go about assigning weights to an image depending on whether it’s single- or multi-lesion? Furthermore, how would I go about weighting single-lesion images more heavily?

From my understanding and previous tinkering, the weight tensor is of the same length as the dataset, and each element corresponds to the weight assigned to a given instance’s class or label. I guess this is where my confusion arises: for a multi-lesion image, there are multiple labels, each with a different frequency and thus a different weight. So I am unsure how to assign a weight for a multi-lesion image.
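For reference, the usual single-label setup described here can be sketched as follows. The `labels` list is a made-up dataset with exactly one label per instance.

```python
from collections import Counter

# Hypothetical single-label dataset: one class label per instance.
labels = ["A"] * 6 + ["B"] * 3 + ["C"] * 1

counts = Counter(labels)

# One weight per instance, the same length as the dataset.
sample_weights = [1.0 / counts[label] for label in labels]

# The total weight carried by each class comes out (nearly) equal.
weight_per_class = {
    c: sum(w for w, l in zip(sample_weights, labels) if l == c)
    for c in counts
}
print(len(sample_weights))  # 10
print(weight_per_class)     # each value is approximately 1.0
```

A list like `sample_weights` is what would be handed to `torch.utils.data.WeightedRandomSampler(sample_weights, num_samples=len(sample_weights), replacement=True)`. The open question in this thread is what the entry for an image with several labels should be.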

KFrank · May 17, 2025, 4:30pm

Hi Aquafina!

Let me illustrate the scheme with a simplified example.

First let me note that it is perfectly reasonable to weight multi-lesion images with
varying weights. It’s just something of a pain in the neck to choose such weights
in a way that is useful. In this example, all multi-lesion images will be given the
same weight and we will rely on your statement that you have enough single-lesion
images of each lesion class to be able to compensate for the class imbalance by
reweighting just the single-lesion images.

Let’s say you have just three lesion classes, lesion-A, lesion-B, and lesion-C. Let’s
say that your training set consists of 30 lesion-A, 10 lesion-B, and 5 lesion-C single-lesion images, as
well as 5 multi-lesion images. Furthermore let’s say – just for simplicity – that in
aggregate the multi-lesion images contain 30 lesion-A’s, 10 lesion-B’s, and 5 lesion-C’s.
(So across your entire training set – both single- and multi-lesion images – you have
a total of 60 lesion-A’s, 20 lesion-B’s, and 10 lesion-C’s.)

We can compensate for the class imbalance by using the following image weights:
multi-lesion images all have weight 1, lesion-A images have weight 1, lesion-B images
have weight 5, and lesion-C images have weight 11.

Now the multi-lesion images contribute (30, 10, 5) to the weighted class counts. (These
are just the counts of the lesion classes in the multi-lesion images with weight 1.) The
30 lesion-A images – with weight 1 – contribute 30 to the lesion-A weighted class count,
for a total lesion-A weighted class count of 60. With weight 5, the 10 lesion-B images
contribute 50 to the lesion-B weighted class count, for a total of 60. Lastly, with weight
11, the 5 lesion-C images contribute 55 to the lesion-C weighted class count, again for
a total of 60.
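The arithmetic above can be checked with a few lines of Python. The numbers are the ones from the example, and the variable names are just for illustration.

```python
# Single-lesion images per class.
single_counts = {"A": 30, "B": 10, "C": 5}
# Lesions per class contained, in aggregate, in the multi-lesion images.
multi_counts = {"A": 30, "B": 10, "C": 5}

# Image weights from the example.
single_weight = {"A": 1, "B": 5, "C": 11}
multi_weight = 1

# Weighted class count = contribution of the multi-lesion images
# plus the weighted contribution of the single-lesion images.
weighted_counts = {
    c: multi_weight * multi_counts[c] + single_weight[c] * single_counts[c]
    for c in single_counts
}
print(weighted_counts)  # {'A': 60, 'B': 60, 'C': 60}
```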

The basic idea is you just take however many of each lesion class you get from the
multi-lesion images (so that we don’t have to worry about how to weight them) and
use the single-lesion images – which you say you have – to “top up” the weighted
class counts for any underrepresented lesion classes by choosing the appropriate
weight for single-lesion images from any particular underrepresented class.
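One way to turn the "top up" idea into code is sketched below. The helper names `top_up_weights` and `image_weights` are hypothetical, and two choices are assumptions of this sketch rather than part of the scheme: the default target is the largest unweighted per-class lesion total, and images with no lesion get the same weight as multi-lesion images.

```python
def top_up_weights(single_counts, multi_counts, target=None):
    """Weight for the single-lesion images of each class, chosen so that
    multi_counts[c] + weight[c] * single_counts[c] == target."""
    if target is None:
        # Assumption: aim for the largest unweighted per-class lesion total.
        target = max(single_counts[c] + multi_counts[c] for c in single_counts)
    weights = {}
    for c, n_single in single_counts.items():
        if n_single == 0:
            raise ValueError(f"no single-lesion images for class {c!r}")
        weights[c] = (target - multi_counts[c]) / n_single
    return weights


def image_weights(image_labels, class_weights, other_weight=1.0):
    """One sampling weight per image: the class weight for single-lesion
    images, other_weight for multi-lesion (and, by assumption, empty) images."""
    return [
        class_weights[labels[0]] if len(labels) == 1 else other_weight
        for labels in image_labels
    ]


class_weights = top_up_weights(
    single_counts={"A": 30, "B": 10, "C": 5},
    multi_counts={"A": 30, "B": 10, "C": 5},
)
print(class_weights)  # {'A': 1.0, 'B': 5.0, 'C': 11.0}

example = image_weights([["A"], ["C"], ["A", "B"], []], class_weights)
print(example)  # [1.0, 11.0, 1.0, 1.0]
```

The resulting list has one entry per image, so it can be passed as the `weights` argument of WeightedRandomSampler, and the sampler can then be given to the DataLoader through its `sampler` argument.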

(Note, in this example, the chosen weights give us exactly-equal weighted class counts.
This is not necessary – they just have to be about the same. Without weights, you have
six times as many lesion-A’s as lesion-C’s – probably too much of an imbalance. You
have twice as many lesion-B’s as lesion-C’s – while that’s probably okay, it’s not ideal.
If you get your weighted class counts to be the same to 10% or 20%, that should be
fine – they don’t need to be exactly equal.)
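A quick check for "about the same" could look like this. Reading the 10% or 20% tolerance as a bound on the ratio of the largest to the smallest weighted class count is an assumption of this sketch, and the function names are hypothetical.

```python
def imbalance_ratio(counts):
    """Largest class count divided by the smallest."""
    return max(counts.values()) / min(counts.values())


def roughly_balanced(counts, tol=0.2):
    """True if the largest count exceeds the smallest by at most tol."""
    return imbalance_ratio(counts) <= 1.0 + tol


unweighted = {"A": 60, "B": 20, "C": 10}
weighted = {"A": 60, "B": 60, "C": 60}

print(imbalance_ratio(unweighted))   # 6.0
print(roughly_balanced(unweighted))  # False
print(roughly_balanced(weighted))    # True
```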

Best.

K. Frank

Aquafina · May 18, 2025, 8:00pm

Hi K. Frank, thanks for the explanation. I will definitely give this idea a try.

I was also thinking about assigning an “averaged” weight to multi-lesion images, but I’m not sure if this would work, or if it’s mathematically sound. Suppose we use the traditional method of 1/(class count) to calculate weights; using your example, the weights for lesion A, B, and C would be 1/60, 1/20, and 1/10 respectively. The single-lesion image weights are then straightforward (a lesion-A image would be assigned 1/60, and so on). For multi-lesion images, I was thinking about averaging the corresponding weights for each individual lesion. So suppose you had an image with lesions A and B; then the overall weight of the image is the average of 1/60 and 1/20. I’m curious what your thoughts are about this?
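For concreteness, the averaging proposal can be written out as below, using exact fractions so the numbers are easy to read. The function name `averaged_image_weight` is made up for this illustration, and the sketch only computes the proposed weights; it does not check whether they balance the classes.

```python
from fractions import Fraction

# Per-lesion class counts from the example above.
class_counts = {"A": 60, "B": 20, "C": 10}

# The "traditional" per-class weight: 1 / (class count).
class_weight = {c: Fraction(1, n) for c, n in class_counts.items()}


def averaged_image_weight(labels):
    """Average of the class weights of all lesions in one image."""
    return sum(class_weight[label] for label in labels) / len(labels)


print(averaged_image_weight(["A"]))       # 1/60
print(averaged_image_weight(["A", "B"]))  # 1/30
```

The fractions would need to be converted with `float(...)` before being collected into the weight list for WeightedRandomSampler.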