Change dataset from Cityscapes to binary segmentation

Hi, when I try to change the dataset from Cityscapes to a binary segmentation task, the cross_entropy loss does not seem to work with the binary mask:

      File "source_only.py", line 126, in main
        lr_scheduler, epoch, visualize if args.debug else None, args)
      File "source_only.py", line 188, in train
        loss_cls_s = criterion(pred_s, label_s)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/loss.py", line 1048, in forward
        ignore_index=self.ignore_index, reduction=self.reduction)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py", line 2693, in cross_entropy
        return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py", line 2390, in nll_loss
        ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
    RuntimeError: only batches of spatial targets supported (3D tensors) but got targets of dimension: 4

and if I change it to BCELoss:

      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/loss.py", line 613, in forward
        return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py", line 2755, in binary_cross_entropy
        "Please ensure they have the same size.".format(target.size(), input.size())
    ValueError: Using a target size (torch.Size([2, 256, 512, 3])) that is different to the input size (torch.Size([2, 19, 256, 512])) is deprecated. Please ensure they have the same size.

It goes wrong with the channels, even if I set num_class from 19 to 2.

I have no idea; any guidance would be very helpful!

nn.CrossEntropyLoss used for a multi-class segmentation use case expects a model output in the shape [batch_size, nb_classes, height, width] while the target should have the shape [batch_size, height, width] and contain class indices in the range [0, nb_classes-1] ([0, 1] in your case for a “binary multi-class segmentation”).
The first error is thus raised, since the target seems to have 4 dimensions.
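
For example, a minimal shape check (the sizes below are just illustrative for a binary multi-class setup):

    import torch
    import torch.nn as nn

    criterion = nn.CrossEntropyLoss()

    # output: [batch_size=2, nb_classes=2, height=256, width=512]
    output = torch.randn(2, 2, 256, 512, requires_grad=True)
    # target: [batch_size, height, width] containing class indices in [0, 1]
    target = torch.randint(0, 2, (2, 256, 512))

    loss = criterion(output, target)
    loss.backward()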

On the other hand, nn.BCEWithLogitsLoss expects both the model output and target to have the same shape as [batch_size, nb_classes, height, width] and the target should contain floating point values in the range [0, 1] for each class (nb_classes would be 1 for a binary segmentation use case).
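
Again as an illustrative sketch, using a single channel for the binary case:

    import torch
    import torch.nn as nn

    criterion = nn.BCEWithLogitsLoss()

    # output and target share the shape [batch_size, 1, height, width]
    output = torch.randn(2, 1, 256, 512, requires_grad=True)
    # target contains floating point values in [0, 1]
    target = torch.randint(0, 2, (2, 1, 256, 512)).float()

    loss = criterion(output, target)
    loss.backward()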

Based on the second error message it seems you are trying to pass color images in the channels-last memory format as the target tensors, which won’t work. In case my assumption is correct you would have to map the colors to class values first.
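
Something like this sketch could work for the mapping (the palette below is made up, so replace it with the colors actually used in your masks):

    import numpy as np

    # hypothetical palette: RGB color -> class index
    color_to_class = {
        (0, 0, 0): 0,        # background
        (255, 255, 255): 1,  # foreground
    }

    def map_colors_to_indices(label_rgb):
        # label_rgb: [H, W, 3] uint8 color mask -> [H, W] class index map
        target = np.zeros(label_rgb.shape[:2], dtype=np.int64)
        for color, cls in color_to_class.items():
            target[(label_rgb == color).all(axis=-1)] = cls
        return target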

Hi, I changed the map to

    self.id_to_trainid = {
        0: 0,
        1: 1
    }
    self.trainid2name = {
        0: "back",
        1: "file",
    }

but it seems to be the same problem…

      File "train_src.py", line 314, in <module>
        main()
      File "train_src.py", line 307, in main
        model = train(cfg, args.local_rank, args.distributed)
      File "train_src.py", line 139, in train
        loss = criterion(pred, src_label)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/loss.py", line 1048, in forward
        ignore_index=self.ignore_index, reduction=self.reduction)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py", line 2693, in cross_entropy
        return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py", line 2390, in nll_loss
        ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
    RuntimeError: only batches of spatial targets supported (3D tensors) but got targets of size: : [2, 3, 720, 1280]

I don’t know where this mapping is used, but based on the error message your target still contains 4 dimensions (now the channel dimension seems to be in dim1).
Could you show how you’ve tried to map the color values to the class indices in the target tensor?

Thanks for the reply, the map is used this way:

    import os.path as osp
    import pickle

    import numpy as np
    from PIL import Image
    from torch.utils import data


    class GTA5DataSet(data.Dataset):
        def __init__(self,
                     data_root,
                     split,
                     transform=None,
                     max_iters=None,
                     num_classes=2,
                     ignore_label=255,
                     debug=False):

            self.split = split
            self.NUM_CLASS = num_classes
            self.data_root = data_root
            self.data_list = []

            if max_iters is not None:
                # class-balanced re-sampling of the source file list
                self.label_to_file, self.file_to_label = pickle.load(
                    open(osp.join(data_root, "gtav_label_info.p"), "rb"))
                self.img_ids = []
                SUB_EPOCH_SIZE = 3000
                tmp_list = []
                ind = dict()
                for i in range(self.NUM_CLASS):
                    ind[i] = 0
                for e in range(int(max_iters / SUB_EPOCH_SIZE) + 1):
                    cur_class_dist = np.zeros(self.NUM_CLASS)
                    for i in range(SUB_EPOCH_SIZE):
                        if cur_class_dist.sum() == 0:
                            dist1 = cur_class_dist.copy()
                        else:
                            dist1 = cur_class_dist / cur_class_dist.sum()
                        # prefer classes that are underrepresented so far
                        w = 1 / np.log(1 + 1e-2 + dist1)
                        w = w / w.sum()
                        c = np.random.choice(self.NUM_CLASS, p=w)

                        if ind[c] > (len(self.label_to_file[c]) - 1):
                            np.random.shuffle(self.label_to_file[c])
                            ind[c] = ind[c] % (len(self.label_to_file[c]) - 1)

                        c_file = self.label_to_file[c][ind[c]]
                        tmp_list.append(c_file)
                        ind[c] = ind[c] + 1
                        cur_class_dist[self.file_to_label[c_file]] += 1

                self.img_ids = tmp_list

            # (population of self.data_list from the file list is omitted in this post)
            if max_iters is not None:
                self.data_list = self.data_list * int(np.ceil(float(max_iters) / len(self.data_list)))

            print('length of gta5', len(self.data_list))

            self.id_to_trainid = {0: 0, 1: 1}
            self.trainid2name = {
                0: "back",
                1: "file",
            }
            self.transform = transform
            self.ignore_label = ignore_label
            self.debug = debug

        def __len__(self):
            return len(self.data_list)

        def __getitem__(self, index):
            if self.debug:
                index = 0
            datafiles = self.data_list[index]
            # (loading of image, label and name from datafiles is omitted in this post)

            # re-assign labels to match the format of Cityscapes
            label_copy = self.ignore_label * np.ones(label.shape, dtype=np.float32)
            for k, v in self.id_to_trainid.items():
                label_copy[label == k] = v
            label = Image.fromarray(label_copy)

            if self.transform is not None:
                image, label = self.transform(image, label)

            return image, label, name
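
For what it's worth, the id_to_trainid remapping above only takes effect once label is a 2D array of class ids; if the label files are color images, the target keeps its channel dimension after the loop, which matches the [2, 3, 720, 1280] size in the error. Below is a minimal sketch of collapsing the colors first, assuming a black background, a white foreground class, and that datafiles stores the label path under a "label" key (all of these are assumptions, so adjust them to your data):

    # inside __getitem__, before the id_to_trainid remapping;
    # the datafiles["label"] key name is assumed
    label = np.asarray(Image.open(datafiles["label"]).convert("RGB"))  # [H, W, 3]
    # collapse the color mask into a [H, W] index map: 0 = back, 1 = file
    label = (label == 255).all(axis=-1).astype(np.uint8)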