Implementing Loss Function for FCN


I am trying to implement a loss function for an FCN. My output is a tensor of shape (n, c, h, w) and my target is of shape (h, w). I would like to calculate a loss between the output and the target, but the problem is that I have a mask: there is only a certain portion of the image that I am interested in calculating the loss for. I am trying to achieve this by unwrapping the image channels into arrays and then applying the mask to them. When I do this, I receive an error:

RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed. at /pytorch/aten/src/THNN/generic/ClassNLLCriterion.c:93

Please see my code (there may be an easier way to do this, as I am new to this):

def Loss(inp, target, mask):
    mask = torch.from_numpy(np.array(mask, dtype=np.uint8))
    target = target.contiguous().view(-1, 1)  # flattening the target image
    mask = mask.contiguous().view(-1, 1)
    target = target[~mask]  # masking the target
    n, c, h, w = inp.size()
    inp1 = np.zeros((target.shape[0], c))  # new empty array of shape (masked_region, c)
    for i in range(c):
        inp1[:, i] = inp[0, i, :, :].view(-1, 1)[~mask]  # masking the input and filling the array
    log_p = F.log_softmax(inp1, dim=1)
    loss = criterion(log_p, target)
    return loss


Could you post the shapes of log_p and target?
Note that your target should also contain the batch dimension ([n, h, w]) and should contain class indices in the range [0, nb_classes-1] in the usual use case.
It looks like you are selecting all “valid” pixels using the mask and storing them in a flattened version. I assume you would have to select the corresponding targets in the same way so that the shapes match again.

However, you could probably also use the original output and target with a non-reducing criterion (pass reduction='none' to nn.NLLLoss) and multiply the result with your mask to zero out the unwanted pixel locations.
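For example, a rough sketch of this masked-loss approach (the shapes and the random mask here are just made up to match your description; the clamp only guards against dividing by zero for an empty mask):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed shapes: (n, c, h, w) logits, (n, h, w) class-index target,
# (n, h, w) binary mask with 1. at pixels that should contribute to the loss
n, c, h, w = 1, 10, 4, 4
output = torch.randn(n, c, h, w, requires_grad=True)
target = torch.randint(0, c, (n, h, w))
mask = (torch.rand(n, h, w) > 0.5).float()

criterion = nn.NLLLoss(reduction='none')
log_p = F.log_softmax(output, dim=1)
loss = criterion(log_p, target)                      # (n, h, w), one value per pixel
masked_loss = (loss * mask).sum() / mask.sum().clamp(min=1.)
masked_loss.backward()
```

The division by mask.sum() keeps the result comparable to the usual 'mean' reduction, only averaged over the valid pixels instead of all of them.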

We could most likely avoid the for loop by using broadcasting, so could you also post the shape of your mask?
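For reference, the loop-free selection could look roughly like this (assuming a batch size of 1 and a boolean (h, w) mask; permute moves the channel dimension last so the mask can index the pixel dimensions directly):

```python
import torch
import torch.nn.functional as F

# Assumed shapes: (1, c, h, w) logits, (h, w) target, boolean (h, w) mask
c, h, w = 10, 4, 4
inp = torch.randn(1, c, h, w)
target = torch.randint(0, c, (h, w))
mask = torch.rand(h, w) > 0.5              # True marks the pixels to keep

# (h, w, c)[mask] -> (num_valid, c): no Python loop over the channels
valid_logits = inp[0].permute(1, 2, 0)[mask]
valid_target = target[mask]                # (num_valid,)
loss = F.cross_entropy(valid_logits, valid_target)
```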

Hey, so here are the sizes of the variables:
inp getting passed in [1, 10, 256, 256] -> batch size 1, 10 classes, 256 x 256 image
target [256, 256]
mask [256, 256]
log_p after the masking operation: [28585, 10]
target after the masking operation: [28585]

My target contains class values between 0 and 9 for each pixel.

I’m confused when you say to pass the original output and target to a non-reducing criterion and then zero out with the mask. Isn’t the output of the loss function a scalar value? How would I apply the mask to it?

The shapes of log_p and target look alright.
Could you check each target batch for its min and max value, as apparently some value is out of bounds before calling the criterion:

print(target.min(), target.max())

You’ll get a scalar value, since the default reduction is set to 'mean'.
If you pass 'none', you’ll get an output of the same shape as target:

output = torch.randn(10, 10, requires_grad=True)
target = torch.randint(0, 10, (10,))

loss = F.cross_entropy(output, target, reduction='none')
print(loss.shape)
> torch.Size([10])

Anyway, since your shapes seem to be OK, you could check the range of each target batch and find the batch which causes the trouble.
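A scan over the batch dimension could look like this (a sketch; the check_targets helper and the injected out-of-range value are just for illustration):

```python
import torch

def check_targets(target, nb_classes):
    # Return (batch_index, min, max) for every batch whose class indices
    # fall outside the valid range [0, nb_classes - 1]
    bad = []
    for i, t in enumerate(target):
        lo, hi = t.min().item(), t.max().item()
        if lo < 0 or hi >= nb_classes:
            bad.append((i, lo, hi))
    return bad

target = torch.randint(0, 10, (4, 8, 8))
target[2, 0, 0] = 10                         # inject an out-of-range value
print(check_targets(target, nb_classes=10))  # flags batch 2
```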