FocalTversky loss function expected input/output size?

I am currently working on a semantic segmentation model and am trying out a different loss function in this case. The loss function I am using is the FocalTversky loss:

```python
class FocalTverskyLoss(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, preds, target, alpha=0.7, beta=0.3, epsilon=1e-6, gamma=3):
        preds = torch.sigmoid(preds)

        # flatten label and preds tensors
        preds = preds.reshape(-1)
        target = target.reshape(-1)

        # True Positives, False Positives & False Negatives
        TP = (preds * target).sum()
        FP = ((1 - target) * preds).sum()
        FN = (target * (1 - preds)).sum()
        Tversky = (TP + epsilon) / (TP + alpha * FP + beta * FN + epsilon)
        FocalTversky = (1 - Tversky) ** gamma

        return FocalTversky
```
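
For reference, the class itself runs once preds and target carry the same number of elements; here is a minimal sanity check with hypothetical binary shapes (assuming the class above is already defined):

```python
import torch

criterion = FocalTverskyLoss()
preds = torch.randn(4, 1, 224, 224)                     # raw logits
target = torch.randint(0, 2, (4, 1, 224, 224)).float()  # binary mask, same shape
print(criterion(preds, target))                         # scalar loss tensor
```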

However, when I run it in my code like this:

```python
outputs = self.model(images)
loss = self.loss_function(preds=outputs, target=labels).to(device)
train_loss += loss.item()
loss.backward()
self.optimizer.step()
self.optimizer.zero_grad()
```

it gives this error:


```
RuntimeError                              Traceback (most recent call last)
<ipython-input> in <module>
      1 imgdir = 'Z:\HuaSheng\datasets\sen12msgrss\DFC_Public_Dataset'
----> 2 trainer(imgdir=imgdir, classes=list(range(0,10)), fsave='Rip_chkpt_FTL.pth', reloadmode='same', checkpoint=None, num_epochs=110)

<ipython-input> in __init__(self, imgdir, classes, num_epochs, fsave, reloadmode, checkpoint, bs, report)
     59 for self.epoch in range(self.num_epochs):
     60     print('\n'+'*'*6+'TRAIN FOR ONE EPOCH'+'*'*6)
---> 61     train_loss = self.train()
     62
     63     print('\n'+'*'*6+'EVAL FOR ONE EPOCH'+'*'*6)

<ipython-input> in train(self)
    163 self.writer.flush()
    164
--> 165 loss = self.loss_function(preds=outputs, target=labels.view(1, -1)).to(device)
    166 #weight=torch.FloatTensor([0.,2.5,0.,1.5,2.1,0.3,4.5,0.,4.5,0.]).to(device)
    167 train_loss += loss.item()

~\anaconda3\envs\pytLocal38\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
    725     result = self._slow_forward(*input, **kwargs)
    726 else:
--> 727     result = self.forward(*input, **kwargs)
    728 for hook in itertools.chain(
    729     _global_forward_hooks.values(),

<ipython-input> in forward(self, preds, target, alpha, beta, epsilon, gamma)
     57
     58 # True Positives, False Positives & False Negatives
---> 59 TP = (preds * target).sum()
     60 FP = ((1-target) * preds).sum()
     61 FN = (target * (1-preds)).sum()

RuntimeError: The size of tensor a (5017600) must match the size of tensor b (501760) at non-singleton dimension 0
```

The input sizes for my preds and target are `torch.Size([10, 10, 224, 224])` and `torch.Size([1, 501760])` respectively.

Is there a way I can make my target's dim 0 into 10? Or is that even the right thing to do when training? Thanks a lot for your help!

Could you explain what the shapes of preds and target represent?
Currently it seems preds uses a batch size of 10, while you have a single target, which will then raise this error after flattening these tensors.
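
As a quick check, reproducing the element counts with the shapes you posted (just a sketch):

```python
import torch

preds = torch.randn(10, 10, 224, 224)  # reported preds shape
target = torch.zeros(1, 501760)        # reported target shape

print(preds.reshape(-1).shape)   # torch.Size([5017600])
print(target.reshape(-1).shape)  # torch.Size([501760])
# 5017600 vs. 501760 elements: the elementwise preds * target
# then raises exactly the RuntimeError from your traceback.
```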

PS: you can post code snippets by wrapping them into three backticks ```, which makes debugging easier. :wink:

Hi @ptrblck, sorry for the poor posting format! haha The target represents the labels of the image and the prediction is the output after fitting the model. The images I am working with consist of 13 channels with 10 classes, and the chip size is 224, so every pixel in the image contains a class used for semantic segmentation modelling.

However, I have come across the point that this loss function is not suitable as-is: mine is a multi-class classification problem, whereas this FocalTversky loss is based on a binary formulation (sigmoid outputs, like binary cross-entropy). So I think it may not be suitable to use this loss.
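
If I were to adapt it for multi-class, something like this might work (a sketch only, untested; assuming preds are [N, C, H, W] logits and target holds per-pixel class indices of shape [N, H, W]):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiClassFocalTverskyLoss(nn.Module):
    # Hypothetical multi-class variant: softmax over the class dimension,
    # targets one-hot encoded to the same [N, C, H, W] layout as preds.
    def __init__(self, alpha=0.7, beta=0.3, gamma=3, epsilon=1e-6):
        super().__init__()
        self.alpha, self.beta, self.gamma, self.epsilon = alpha, beta, gamma, epsilon

    def forward(self, preds, target):
        # preds: [N, C, H, W] raw logits; target: [N, H, W] integer class indices
        num_classes = preds.shape[1]
        probs = F.softmax(preds, dim=1)
        target_1h = F.one_hot(target.long(), num_classes).permute(0, 3, 1, 2).float()
        dims = (0, 2, 3)  # reduce over batch and spatial dims, keep a score per class
        TP = (probs * target_1h).sum(dims)
        FP = (probs * (1 - target_1h)).sum(dims)
        FN = ((1 - probs) * target_1h).sum(dims)
        tversky = (TP + self.epsilon) / (TP + self.alpha * FP + self.beta * FN + self.epsilon)
        return ((1 - tversky) ** self.gamma).mean()
```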

Thanks for the update. While your use case makes sense, it’s still unclear why the batch size of the output images is 10 while the target has only a single sample.
If your target is supposed to contain 10 samples, then I guess a reshaping operation might be wrong on this tensor.
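
The numbers in the error actually line up with that: assuming the target originally had shape [10, 224, 224] (per-pixel class indices), the `labels.view(1, -1)` visible in your traceback would produce exactly the reported [1, 501760]:

```python
import torch

# Assumption: the target started out as [N, H, W] = [10, 224, 224] class indices
labels = torch.randint(0, 10, (10, 224, 224))

print(labels.view(1, -1).shape)  # torch.Size([1, 501760]) -- the reported target shape
# Keeping labels as [10, 224, 224] would preserve the batch dimension of 10,
# matching the model output's batch size.
```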