Shape for multiple-channel, 2D mask weights using BCEWithLogitsLoss

morepenguins · July 27, 2022, 8:04pm

I have a set of 256x256 images, each labeled with nine binary 256x256 masks. I am trying to calculate pos_weight in order to weight BCEWithLogitsLoss.

The shape of my masks tensor is:

torch.Size([1000, 9, 256, 256])

Here 1000 is the number of training images, 9 is the number of mask channels (all encoded as 0/1), and 256 is the height and width of each image.

To calculate pos_weight, I counted the zeros in each mask channel and divided that count by the number of ones in the same channel (following the advice suggested here):

(masks[:,channel,:,:]==0).sum()/masks[:,channel,:,:].sum()
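For reference, the same per-channel ratio can probably be computed for all nine channels at once by reducing over the batch, height, and width axes. This is only a minimal sketch: the small random `masks` tensor is a hypothetical stand-in for my real data, and the `clamp` guard against channels with no positive pixels is an assumption I added, not part of the advice I followed.

```python
import torch

# Hypothetical stand-in for the real masks: 8 images, 9 channels, 16x16 pixels.
torch.manual_seed(0)
masks = (torch.rand(8, 9, 16, 16) > 0.7).float()

# Count ones and zeros per channel by reducing over batch, height, and width.
num_pos = masks.sum(dim=(0, 2, 3))
num_neg = (masks == 0).sum(dim=(0, 2, 3))

# Negative-to-positive ratio per channel.
# Assumption: clamp avoids division by zero if a channel has no positive pixels.
pos_weight = num_neg / num_pos.clamp(min=1)

print(pos_weight.shape)  # one weight per mask channel
```

Each entry of `pos_weight` should match the single-channel expression above evaluated for that channel.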

Calculating the weight for every mask channel gives a tensor of shape torch.Size([9]), which seems intuitive to me, since I want one pos_weight value for each of the nine mask channels. However, when I try to fit my model, I get the following error message:

RuntimeError: The size of tensor a (9) must match the size of tensor b (256) at non-singleton dimension 3

This surprises me because the error message seems to suggest that the weights need to match the side length of the images rather than the number of mask channels. What shape should pos_weight be, and how do I specify that it provides weights for the mask channels?
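My current understanding, which may be incomplete, is that pos_weight is broadcast against the target, and broadcasting aligns trailing dimensions first. A tensor of shape [9] would then be matched against the last axis (width, 256) instead of the channel axis. The sketch below reproduces the failure on small placeholder tensors and tries a [9, 1, 1] view so that the 9 lines up with the channel axis. The tensor sizes and weight values are made up for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder logits and targets: batch of 4, 9 channels, 16x16 pixels.
logits = torch.randn(4, 9, 16, 16)
targets = (torch.rand(4, 9, 16, 16) > 0.5).float()

# Placeholder per-channel weights with shape [9].
pos_weight = torch.arange(1, 10, dtype=torch.float32)

# Shape [9] is aligned with the last (width) axis, so broadcasting fails.
try:
    nn.BCEWithLogitsLoss(pos_weight=pos_weight)(logits, targets)
    flat_failed = False
except RuntimeError:
    flat_failed = True

# Shape [9, 1, 1] aligns the 9 with the channel axis of [N, 9, H, W].
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight.view(-1, 1, 1))
loss = criterion(logits, targets)

print(flat_failed, loss.item())
```

If this reading is right, the reshaped weights apply one value per mask channel across every pixel of that channel.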