Issues with nn.NLLLoss2d for semantic segmentation


I’ve been trying to implement FCN-32 by Shelhamer et al. in PyTorch. I apply nn.NLLLoss2d as the loss criterion to the upsampled output. To upsample I use: nn.ConvTranspose2d(,, 64, stride=32, bias=False). However, the loss criterion then gives me a spatial dimension mismatch error. For example, I’m feeding in an image of size 1x3x375x500 and my network produces an output of size 1x21x320x448. If someone could point out what I’m doing wrong and/or suggest a better multinomial cross-entropy loss function, it would be much appreciated.
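For reference, the reported output size can be checked by hand with the standard output-size formula for a transposed convolution. This is only a sketch: the helper name `conv_transpose2d_out` is made up for illustration, and the 9x13 score map going into the upsampling layer is an assumption inferred from the 320x448 output, not something measured from the network.

```python
def conv_transpose2d_out(size_in, kernel_size, stride,
                         padding=0, output_padding=0, dilation=1):
    """Output length along one spatial axis of a transposed convolution."""
    return ((size_in - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)

# Assumed size of the coarse score map before upsampling (hypothetical,
# inferred from the 1x21x320x448 output reported above).
score_h, score_w = 9, 13

# kernel_size=64 and stride=32 as in the upsampling layer above.
out_h = conv_transpose2d_out(score_h, kernel_size=64, stride=32)
out_w = conv_transpose2d_out(score_w, kernel_size=64, stride=32)

print(out_h, out_w)  # 320 448, which differs from the 375x500 input
```

Under that assumption, the 320x448 output follows directly from the layer settings, so it will not line up with a 375x500 label map unless one of the two is cropped or resized.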


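nn.NLLLoss2d() requires an input of size N x C x H x W and a target/label of size N x H x W. I suspect your target/label is of size N x 1 x H x W. Use torch.squeeze() on dimension 1 to remove that singleton dimension (with a batch size of 1, squeezing without a dimension argument would also drop the batch dimension).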
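A minimal sketch of the shapes involved, assuming a recent PyTorch release in which nn.NLLLoss accepts N x C x H x W input directly (used here in place of nn.NLLLoss2d). The tensors are random placeholders, not real network output or labels, and the target is built at the same 320x448 spatial size as the scores, since the loss also needs the spatial dimensions to agree.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

N, C, H, W = 1, 21, 320, 448

# Placeholder for the network's upsampled class scores: N x C x H x W.
scores = torch.randn(N, C, H, W)

# Placeholder label map with an extra channel dimension: N x 1 x H x W.
target = torch.randint(0, C, (N, 1, H, W))

# NLLLoss expects log-probabilities over the class dimension.
log_probs = F.log_softmax(scores, dim=1)

# Drop only dimension 1, so the batch dimension survives when N == 1.
target = target.squeeze(1)  # now N x H x W, dtype int64

criterion = nn.NLLLoss()
loss = criterion(log_probs, target)
print(target.shape, loss.item())
```

Passing the dimension explicitly to squeeze keeps the target at N x H x W even for a single-image batch.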