U-Net can't perfectly segment image

If you’re calculating each image separately in a loop, you could do:

... 
        return img_x, img_y, negatives, positives

negs = 0
posv = 0
for i in range(len(train_y)):
    image, mask, negatives, positives = dataset[i]
    negs += negatives.item()
    posv += positives.item()

pos_weight = negs/posv

thanks! here’s my updated pos weight

Pos_weight:  124.26197203506484

That looks about right.

so now i just pass that to my loss function?

criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

The docs state:

  • pos_weight (Tensor, optional) – a weight of positive examples. Must be a vector with length equal to the number of classes.

https://pytorch.org/docs/stable/generated/torch.nn.BCEWithLogitsLoss.html

That seems to suggest you won’t need the torch.sum, and should only accumulate over the batch dimension.

        # Calculate pos_weight
        negatives = img_y == 0
        positives = img_y == 1
...

negs = torch.zeros((1, 256, 256)) # make sure the size here matches the dims and sizes for 1 image
posv = torch.zeros((1, 256, 256))
for i in range(len(train_y)):
    image, mask, negatives, positives = dataset[i]
    negs += negatives # if this has a batch dim, add negatives.sum(dim=1) instead
    posv += positives # see the note above

pos_weight = negs/posv

See the comments, too.
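
For reference, pos_weight just needs to be a tensor that broadcasts against the target, so either of these forms should work (shapes here are assumed to match your 256×256 masks):

import torch
import torch.nn as nn

logits = torch.randn(4, 1, 256, 256)                   # example model output for a batch of 4
masks = torch.randint(0, 2, (4, 1, 256, 256)).float()  # example binary ground-truth masks

# a single weight for the one foreground class...
criterion = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([124.26]))
loss = criterion(logits, masks)

# ...or a per-pixel weight tensor, which broadcasts over the batch dimension
per_pixel_weight = torch.full((1, 256, 256), 124.26)
criterion = nn.BCEWithLogitsLoss(pos_weight=per_pixel_weight)
loss = criterion(logits, masks)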

On second thought, in light of the transforms being used, a per-pixel pos_weight may have some unintended results. For example, if a given pixel had no positive ground truths in the entire set, you’d get a divide by zero (quite plausible given this set). That could be dealt with by setting pos_weight[torch.isinf(pos_weight) | torch.isnan(pos_weight)] = 124.26.

Another example: if a given pixel had only one positive ground truth, so pos_weight = 58/1 there, but due to the transforms the model ends up seeing 2 or 3 positive ground truths at that pixel, that might unbalance the training as well.

To deal with this, you could first run that per-class counting loop 100 or so times over the whole training set, with the transforms you intend to use applied, before doing the division for the pos_weight tensor. That has the benefit of statistically capturing what will likely occur during training. Then just apply the isinf filter to set any “dead” pixels to the mean pos_weight you found earlier.

so i just need to set it like this?

criterion = nn.BCEWithLogitsLoss(pos_weight[torch.isinf(pos_weight)|torch.isnan(pos_weight)] = 124.26)

Call that statement on a separate line. It’s just saying “set any elements of pos_weight that are inf or nan equal to 124.26”. Once those are set, you can pass in just pos_weight.
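
So, on separate lines, something like:

# replace inf/NaN entries (pixels with no positives anywhere) with the average pos_weight from earlier
pos_weight[torch.isinf(pos_weight) | torch.isnan(pos_weight)] = 124.26

# then pass the cleaned-up tensor to the loss
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)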

mm i’ve tried to train it, and the result got worse… train and test loss seem high too

Your issue originally was that the model was guessing more negatives than the ground truths typically had. Now it seems to be going too far in the opposite direction, which means you’re on the right track.

Seems you just need to tweak this value some. What was your final code for calculating the pos_weight?

here’s my code, is the error because i am still using sum?

total_positive_pixels = 0
total_negative_pixels = 0

for x, y in data_loader:
    positive_pixels_batch = (y == 1).sum().item()
    negative_pixels_batch = (y == 0).sum().item()
    total_positive_pixels += positive_pixels_batch
    total_negative_pixels += negative_pixels_batch

pos_weight = torch.tensor(total_negative_pixels / total_positive_pixels)
print(pos_weight)

Please see my earlier comment on the per-pixel pos_weight and the divide-by-zero pixels, and the one on looping over the training set with your transforms applied.

You could take an average over all elements and apply that as one global pos_weight (as in your current code), but the distribution of positive pixels is centralized and not evenly spread across the image.

So if a given pixel has 5/59 positive examples in the training set (roughly 1/12), but the pos_weight is 124, the positive representation gets adjusted by 124 * 5/59 ≈ 10.5. Meaning it’s the same as if the model sees 10.5 times as many positive values on that pixel, which is way too far in the opposite direction. We want that to be 1, or close to 1, for every pixel.
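
To put rough numbers on that, here is the hypothetical 5-of-59 pixel from above:

pos, neg = 5, 59 - 5        # hypothetical pixel: positive in 5 of 59 training masks

global_w = 124.26           # one scalar weight for every pixel (dataset-wide neg/pos ratio)
per_pixel_w = neg / pos     # 10.8, the ratio that actually balances this particular pixel

print(global_w * pos, neg)     # ~621 weighted positives vs 54 negatives: far too much the other way
print(per_pixel_w * pos, neg)  # 54.0 vs 54: effective ratio ~1, which is what we want at every pixel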

The two comments above, combined, address this by giving you a method to find a statistical pos_weight value per pixel, based on the transforms being used. That will give you a tensor to pass in as pos_weight.

ah okay, sorry for my misunderstanding, here’s my code

dataset = ProcessDataset(train_x, train_y)

negs = torch.zeros((1, 256, 256)) #make sure the size here matches the dims and sizes for 1 image
posv = torch.zeros((1, 256, 256))
for i in range(len(train_y)):
    image, mask, negatives, positives = dataset[i]
    negs += negatives.sum(dim=1) 
    posv += positives.sum(dim=1) 

pos_weight = negs/posv
print(pos_weight)

and the result is

tensor([[[     inf,      inf,      inf,  ..., 158.5177, 164.4146, 170.7215],
         [     inf,      inf,      inf,  ..., 158.5177, 164.4146, 170.7215],
         [     inf,      inf,      inf,  ..., 158.5177, 164.4146, 170.7215],
         ...,
         [     inf,      inf,      inf,  ..., 158.5177, 164.4146, 170.7215],
         [     inf,      inf,      inf,  ..., 158.5177, 164.4146, 170.7215],
         [     inf,      inf,      inf,  ..., 158.5177, 164.4146, 170.7215]]])

Getting closer, but the code still does not address the following points:

  1. Get a statistical distribution of pos_weights based on the transforms that will be applied during training.
  2. Replace any inf and NaN values with some appropriate pos_weight, just in case that pixel gets triggered as an outlier from the transforms during a training round.

In order to achieve point 1, you can use something like:

for k in range(100):
    for i in range(len(train_y)):
        ...

You’ll also need to iterate over your actual train data loader, so you get a distribution of the transformed outputs:

negs = torch.zeros((1, 256, 256)) #make sure the size here matches the dims and sizes for 1 image
posv = torch.zeros((1, 256, 256))

for k in range(100):
    for i, (data, target) in enumerate(train_loader):
        negatives = target == 0
        positives = target == 1
        negs += negatives.sum(dim=0)
        posv += positives.sum(dim=0)

pos_weight = negs/posv

And for addressing point number 2, see the earlier comment about setting any inf or NaN elements of pos_weight to the mean value (124.26).

wait i got confused, so the pos weight is calculated for the transformed dataset right?

I don’t know of any standard approach to this type of problem. If you’ve found something that works and gives you the desired performance, then additional tweaks, like the one suggested, may be irrelevant.

Here was the reasoning for including transformed data in the pos_weight:

It should represent the data the model is trained on. But seeing as your dataset is very small, additional considerations may be necessary.

ah thank you for your suggestion, but i think it’s not a good solution in my case, i’ll try something else, thanks again!