Right way to make one hot encoding for segmentation?

(John1231983) #1

I have a target with size of NxHxW, where N is batch size. I want to make one hot encoding with the output size of NxCxHxW for image segmentation.

def one_hot(targets, C):  # C = number of classes
    targets_extend = targets.clone()
    targets_extend.unsqueeze_(1) # convert to Nx1xHxW
    one_hot = torch.cuda.FloatTensor(targets_extend.size(0), C, targets_extend.size(2), targets_extend.size(3)).zero_()
    one_hot.scatter_(1, targets_extend, 1)
    return one_hot

In the code above, I used targets_extend = targets.clone() because targets is also needed for the cross-entropy loss, which only accepts targets of size NxHxW, not Nx1xHxW. To be clearer: the one-hot output is used for the BCE loss, while targets is used for the cross-entropy loss. Is using targets_extend = targets.clone() the correct way to build the one-hot encoding? Could I use targets_extend = targets.detach() instead? Thanks
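A minimal, CPU-only sketch of the approach described above (using torch.zeros instead of torch.cuda.FloatTensor so it runs without a GPU; the N, C, H, W values are made up for illustration):

```python
import torch

def one_hot(targets, C):
    # clone first so the in-place unsqueeze_ below does not modify
    # the original targets, which are still needed for cross-entropy
    targets_extend = targets.clone()
    targets_extend.unsqueeze_(1)  # N x 1 x H x W
    encoded = torch.zeros(targets_extend.size(0), C,
                          targets_extend.size(2), targets_extend.size(3))
    encoded.scatter_(1, targets_extend, 1)
    return encoded

targets = torch.randint(0, 3, (2, 4, 4))  # N=2, H=W=4, 3 classes
encoded = one_hot(targets, C=3)
print(encoded.shape)   # torch.Size([2, 3, 4, 4])
print(targets.shape)   # unchanged: torch.Size([2, 4, 4])
```

Each spatial position in the result has exactly one channel set to 1, namely the channel matching the class index in targets.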


I would stick with .clone(), since .detach() still shares the same underlying data.
Your in-place unsqueeze_ would therefore also unsqueeze targets:

x = torch.ones(10)
y = x.detach()
y.unsqueeze_(0)  # in-place op on the detached tensor
print(x.shape)
print(y.shape)
> torch.Size([1, 10])
> torch.Size([1, 10])

May I ask, why you need the BCELoss?
It doesn’t seem like you have a multi-label setup or are you manipulating the one_hot target somehow afterwards?

(John1231983) #3

Thanks. I use it for computing the adversarial loss over multiple classes (e.g. 21 classes in VOC), while the cross-entropy is the segmentation loss. As I mentioned, cross-entropy expects targets of size NxHxW, while BCELoss expects NxCxHxW. So your answer is to use .clone(), am I right?


Yes, I would use clone and provide two different targets for the different criteria.
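To make the final setup concrete, here is a minimal sketch of feeding two different targets to the two criteria (the shapes, class count, and random logits standing in for a network's output are made-up assumptions):

```python
import torch
import torch.nn as nn

N, C, H, W = 2, 3, 4, 4
logits = torch.randn(N, C, H, W)           # stand-in for segmentation network output
targets = torch.randint(0, C, (N, H, W))   # class indices

# cross-entropy (segmentation loss) takes class indices of shape NxHxW
ce_loss = nn.CrossEntropyLoss()(logits, targets)

# BCE (adversarial loss) takes a one-hot target of shape NxCxHxW,
# built from a clone so targets itself is left untouched
one_hot = torch.zeros(N, C, H, W).scatter_(1, targets.clone().unsqueeze_(1), 1)
bce_loss = nn.BCELoss()(torch.sigmoid(logits), one_hot)
```

Because the one-hot tensor is built from a clone, the in-place unsqueeze_ leaves the NxHxW targets intact for the cross-entropy criterion.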