I’m working on a semantic segmentation project. I have not been able to find a good segmentation example, and my model’s accuracy is poor. Here is the relevant code:

criterion = nn.CrossEntropyLoss().cuda()
image, target = image.cuda(), mask.cuda()
image, target = Variable(image), Variable(target)
output = model(image)
_, pred = torch.max(output, dim=1)
output = output.permute(0,2,3,1).contiguous()
output = output.view(-1, output.size()[-1])
mask_label = target.view(-1)
loss = criterion(output, mask_label)

image is BCHW and target is BHW. Is this code correct, and does anyone know of a function for computing mean IoU in PyTorch?
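
On the mean IoU question, one common approach is to build a confusion matrix and read the per-class IoU off it. The following is only a minimal sketch: the function name `mean_iou`, the `num_classes` argument, and the `ignore_index=255` default are assumptions made for illustration and are not taken from any project in this thread. It expects `pred` to be the argmax over the class dimension, like `pred` in the snippet above.

```python
import torch


def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes, computed from a confusion matrix.

    pred, target: integer (long) tensors of the same shape, e.g. B x H x W.
    ignore_index is an assumed "void" label; pixels with it are skipped.
    """
    pred = pred.reshape(-1)
    target = target.reshape(-1)
    keep = target != ignore_index            # drop ignored pixels
    pred, target = pred[keep], target[keep]

    # Confusion matrix: rows are true classes, columns are predicted classes.
    index = target * num_classes + pred
    conf = torch.bincount(index, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes).double()

    intersection = conf.diag()
    union = conf.sum(dim=0) + conf.sum(dim=1) - intersection
    present = union > 0                      # skip classes absent from both
    return (intersection[present] / union[present]).mean().item()


# Tiny example: 2 classes, one ignored pixel.
pred = torch.tensor([[0, 1], [1, 1]])
target = torch.tensor([[0, 1], [0, 255]])
print(mean_iou(pred, target, num_classes=2))  # 0.5
```

For a whole validation set it is usually better to sum the confusion matrices over all batches and compute the IoU once at the end, rather than averaging per-batch values.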

I am training a model with this project and will compare its results with a model trained with Caffe. So far the loss is nearly the same as the loss when training with Caffe, and it is stable. Have you compared against another DL framework? The unstable loss and bad performance may be caused by the dataset, the CNN structure, or the hyperparameters.

@EthanZhangYi I think last time I simply ran the script trainer.py to see the performance. I didn’t check the code carefully. The dataset is VOC2012.
The output should look like this. So did you change the model or the code?

Epoch [1/80] Iter [20/3000] Loss: 928.0042
Epoch [1/80] Iter [40/3000] Loss: 3225.1040
Epoch [1/80] Iter [60/3000] Loss: 3037.4116
Epoch [1/80] Iter [80/3000] Loss: 806.6054
Epoch [1/80] Iter [100/3000] Loss: 1905.5277
Epoch [1/80] Iter [120/3000] Loss: 13097.4932
Epoch [1/80] Iter [140/3000] Loss: 590.4274
Epoch [1/80] Iter [160/3000] Loss: 379.0482
Epoch [1/80] Iter [180/3000] Loss: 1181.2756
Epoch [2/80] Iter [20/3000] Loss: 305.0484
Epoch [2/80] Iter [40/3000] Loss: 1294.6436
Epoch [2/80] Iter [60/3000] Loss: 1791.2438
Epoch [2/80] Iter [80/3000] Loss: 682.8095
Epoch [2/80] Iter [100/3000] Loss: 1744.4493
Epoch [2/80] Iter [120/3000] Loss: 13163.7197
Epoch [2/80] Iter [140/3000] Loss: 587.6023
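
One thing that may be worth ruling out with loss values this large is whether the loss is being summed over pixels instead of averaged. This is only a guess about the log above, not a diagnosis. The sketch below uses random tensors (the 64x64 size is an arbitrary assumption; 21 is the VOC2012 class count including background) to show how much the two reductions differ in scale.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(1, 21, 64, 64)           # random scores, 21 classes
target = torch.randint(0, 21, (1, 64, 64))    # random labels

loss_mean = F.cross_entropy(logits, target, reduction="mean")
loss_sum = F.cross_entropy(logits, target, reduction="sum")

# The summed loss is the mean loss times the number of labelled pixels,
# so it grows with image size and batch size.
num_pixels = target.numel()
print(loss_mean.item(), loss_sum.item(), num_pixels)
```

If the summed form is in use, the printed loss depends on how many valid pixels each batch contains, which can make the curve look noisier than it really is.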

Hi, @Zhengtian
I just reused the loss from that project and trained the model with my own script and a private dataset, so I only checked loss.py. It worked correctly. Maybe something else is wrong.

I rewrote loss.py as an nn.Module. I hope it’s helpful for you.

import torch.nn.functional as F
import torch.nn as nn


class CrossEntropy2d(nn.Module):

    def __init__(self, size_average=True, ignore_label=255):
        super(CrossEntropy2d, self).__init__()
        # Note: newer PyTorch releases deprecate size_average in favour of reduction.
        self.size_average = size_average
        self.ignore_label = ignore_label

    def forward(self, predict, target, weight=None):
        """
        Args:
            predict:(n, c, h, w)
            target:(n, h, w)
            weight (Tensor, optional): a manual rescaling weight given to each class.
                If given, has to be a Tensor of size "nclasses"
        """
        assert not target.requires_grad
        assert predict.dim() == 4
        assert target.dim() == 3
        assert predict.size(0) == target.size(0), "{0} vs {1} ".format(predict.size(0), target.size(0))
        assert predict.size(2) == target.size(1), "{0} vs {1} ".format(predict.size(2), target.size(1))
        assert predict.size(3) == target.size(2), "{0} vs {1} ".format(predict.size(3), target.size(2))
        n, c, h, w = predict.size()
        # Keep only pixels with a valid, non-ignored label.
        target_mask = (target >= 0) & (target != self.ignore_label)
        target = target[target_mask]
        # (n, c, h, w) -> (n, h, w, c), then select the kept pixels as rows of size c.
        predict = predict.transpose(1, 2).transpose(2, 3).contiguous()
        predict = predict[target_mask.view(n, h, w, 1).repeat(1, 1, 1, c)].view(-1, c)
        loss = F.cross_entropy(predict, target, weight=weight, size_average=self.size_average)
        return loss

When I wrote this loss module, F.cross_entropy only supported the 1D case, so the prediction of shape [N, C, H, W] is transposed to [N, H, W, C] and viewed as [NHW, C].
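
For readers on a more recent PyTorch: as far as I know, `F.cross_entropy` now accepts `[N, C, H, W]` predictions with `[N, H, W]` targets directly and takes an `ignore_index` argument, so the manual transpose and masking may no longer be needed. The sketch below compares the two routes on random data. The tensor sizes and the ignore value of 255 are assumptions chosen for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n, c, h, w = 2, 3, 4, 5
logits = torch.randn(n, c, h, w)
target = torch.randint(0, c, (n, h, w))
target[0, 0, 0] = 255                         # one pixel marked as "ignore"

# Direct call on (N, C, H, W) logits and (N, H, W) labels.
loss_direct = F.cross_entropy(logits, target, ignore_index=255)

# Manual route from the post: NCHW -> NHWC -> (N*H*W, C), then mask.
flat_logits = logits.permute(0, 2, 3, 1).reshape(-1, c)
flat_target = target.reshape(-1)
keep = flat_target != 255
loss_manual = F.cross_entropy(flat_logits[keep], flat_target[keep])

print(torch.allclose(loss_direct, loss_manual))
```

Both routes average over the non-ignored pixels only, so they should agree up to floating-point error.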