only batches of spatial targets supported (non-empty 3D tensors) but got targets of size: : [1, 3, 375, 1242]

If the mapping is not given in the dataset (or the hosting website) somewhere, you could just create your own mapping like I did for the medical segmentation dataset.
Which dataset are you using? Maybe the mapping was already created somewhere.

Thank you for your reply! Could you give me a link to your example for the medical segmentation dataset?
Here is the dataset I used:

Sure! @Neda used the TEM dataset from here. I just downloaded it and created a pseudo-mapping using all unique values in the images, since I couldn’t find an official color-to-label mapping.
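
For reference, a minimal sketch of what building such a pseudo-mapping could look like (the values here are made up, not from the actual TEM data):

import torch

# stand-in for a single-channel mask loaded from one of the images
mask_tensor = torch.tensor([[0, 0, 85], [85, 170, 255]])

# every unique value found in the masks gets an arbitrary but fixed class index
unique_values = torch.unique(mask_tensor)
mapping = {int(v): idx for idx, v in enumerate(unique_values)}
print(mapping)  # {0: 0, 85: 1, 170: 2, 255: 3}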

I’m just wondering how you create your pseudo-mapping. Do you print all the values in the images and look at the differences? When I try to print an image, it only shows me a very small part of the data, and the data for a single image is normally very large. Could you tell me how you can observe all the different unique values in one image and then create the mapping?

You could try the following code using torch.unique and the dim argument:

from PIL import Image
import numpy as np
import torch

image = Image.open(path)
im_arr = np.array(image)
im_tensor = torch.from_numpy(im_arr)
im_tensor = im_tensor.permute(2, 0, 1)  # HWC -> CHW
# each column is one pixel's RGB value, so the unique columns are the unique colors
unique_colors = torch.unique(im_tensor.view(im_tensor.size(0), -1), dim=1)

Let me know if that works for you!

Thank you very much :smile:, this works!
Here is my code:

colors = torch.tensor([])
for i in range(len(trainmask)):
    im_str = trainimage[i]
    im_arr = io.imread(os.path.join(train_mask, im_str))
    im_tensor = torch.from_numpy(im_arr)
    im_tensor = im_tensor.permute(2,0,1)
    unique_colors = torch.unique(im_tensor.view(im_tensor.size(0), -1)).type(torch.FloatTensor)
    colors = torch.cat((colors, unique_colors))
    colors = torch.unique(colors)
print(colors)

and here is the result:

tensor([  64.,  150.,  107.,  110.,   20.,  220.,  119.,  180.,   11.,
         152.,    0.,  160.,   60.,  102.,  230.,   81.,  255.,  120.,
          30.,  100.,   80.,  140.,  251.,  142.,   74.,  111.,   90.,
          35.,  156.,  190.,  170.,  153.,   32.,  250.,  244.,  165.,
         128.,   70.,  130.,  232.])

There are 40 unique colour values in my mask images, which means I have 40 unique classes in my dataset.

I’m now wondering how to find the relationship between these colour values and the class types.
Do you have any suggestions?

I’m not so sure about the last unique call.
unique_colors should be a [number_of_colors, 3] tensor, so that each unique color contains the RGB values. Currently you are calling unique again on it, so the RGB colors are not usable anymore.

Anyway, if you’ve fixed this and can’t find any official mapping, I would just create a dict with your own mapping and stick to it. In the end you just need class indices for the different colors. It doesn’t matter if car has the class index 0 or 40 as long as you don’t mix it up. :wink:
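
For example, a small sketch of such a dict (with made-up colors and indices, just to illustrate the idea) and the corresponding lookup could look like this:

import torch

# hypothetical color -> class index mapping; which index belongs to which color is up to you
mapping = {(0, 0, 0): 0, (128, 64, 128): 1, (220, 20, 60): 2}

# mask flattened to [num_pixels, 3]
mask = torch.tensor([[0, 0, 0], [128, 64, 128], [220, 20, 60]], dtype=torch.uint8)

target = torch.empty(mask.size(0), dtype=torch.long)
for color, cls_idx in mapping.items():
    # compare whole RGB rows, not single channel values
    matches = (mask == torch.tensor(color, dtype=torch.uint8)).all(dim=1)
    target[matches] = cls_idx
print(target)  # tensor([0, 1, 2])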

Do you mean I should specify the dim argument of unique to get [number_of_colors, 3]?
I tried to specify the dimension, but when I add dim=1, I get this error:

unique() got an unexpected keyword argument 'dim'

My torch version is 0.4.1.
Do you know what’s wrong? :smiley:

The dim argument was added after the 0.4.1 release, so I would recommend updating to the latest stable release (1.0) to get this and other useful features. :wink:
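
As a quick sanity check, you could print the installed version before relying on the dim argument:

import torch
print(torch.__version__)  # needs to be newer than 0.4.1 for torch.unique(..., dim=...)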

Problem solved, here is my code:

colors_all = torch.tensor([])
for i in range(len(mask_list)):
    mask_str = mask_list[i]
    mask_arr = io.imread(os.path.join(mask_dir, mask_str))
    mask_tensor = torch.from_numpy(mask_arr)
    mask_tensor = mask_tensor.permute(2, 0, 1)

    # print(mask_tensor.shape)
    colors = torch.unique(mask_tensor.view(mask_tensor.size(0), -1), dim=1)
    colors = colors.permute(1, 0).type(torch.FloatTensor)
    colors_all = torch.cat((colors_all, colors))
    colors_unique = torch.unique(colors_all, dim=0)
print(colors_unique.shape)
print(colors_unique)

Here are the results:

torch.Size([29, 3])
tensor([[  0.,   0.,   0.],
        [  0.,   0.,  70.],
        [  0.,   0.,  90.],
        [  0.,   0., 110.],
        [  0.,   0., 142.],
        [  0.,   0., 230.],
        [  0.,  60., 100.],
        [  0.,  80., 100.],
        [ 70.,  70.,  70.],
        [ 70., 130., 180.],
        [ 81.,   0.,  81.],
        [102., 102., 156.],
        [107., 142.,  35.],
        [111.,  74.,   0.],
        [119.,  11.,  32.],
        [128.,  64., 128.],
        [150., 100., 100.],
        [150., 120.,  90.],
        [152., 251., 152.],
        [153., 153., 153.],
        [180., 165., 180.],
        [190., 153., 153.],
        [220.,  20.,  60.],
        [220., 220.,   0.],
        [230., 150., 140.],
        [244.,  35., 232.],
        [250., 170.,  30.],
        [250., 170., 160.],
        [255.,   0.,   0.]])

There are 29 classes in total :rofl:

@ptrblck I have one question concerning the mapping. I have read your mapping example, which maps, for example:
85:0
In my case, I have 29 unique classes in total, and my mask has size 3x375x1242. After the code below:

mask_tensor = mask_tensor.permute(2,0,1)
mask_tensor = mask_tensor.view(mask_tensor.size(0), -1).permute(1,0)

I get a mask of size torch.Size([465750, 3]), so each unique color is 1 x 3 and the mask is now 465750 x 3.
The mapping is:

self.mapping = {
        torch.tensor([  0,   0,   0], dtype=torch.uint8):0,
        torch.tensor([  0,   0,  70], dtype=torch.uint8):1,
        torch.tensor([  0,   0,  90], dtype=torch.uint8):2,
        torch.tensor([  0,   0, 110], dtype=torch.uint8):3,
        torch.tensor([  0,   0, 142], dtype=torch.uint8):4,
        torch.tensor([  0,   0, 230], dtype=torch.uint8):5,
        torch.tensor([  0,  60, 100], dtype=torch.uint8):6,
        torch.tensor([  0,  80, 100], dtype=torch.uint8):7,
        torch.tensor([ 70,  70,  70], dtype=torch.uint8):8,
        torch.tensor([ 70, 130, 180], dtype=torch.uint8):9,
        torch.tensor([ 81,   0,  81], dtype=torch.uint8):10,
        torch.tensor([102, 102, 156], dtype=torch.uint8):11,
        torch.tensor([107, 142,  35], dtype=torch.uint8):12,
        torch.tensor([111,  74,   0], dtype=torch.uint8):13,
        torch.tensor([119,  11,  32], dtype=torch.uint8):14,
        torch.tensor([128,  64, 128], dtype=torch.uint8):15,
        torch.tensor([150, 100, 100], dtype=torch.uint8):16,
        torch.tensor([150, 120,  90], dtype=torch.uint8):17,
        torch.tensor([152, 251, 152], dtype=torch.uint8):18,
        torch.tensor([153, 153, 153], dtype=torch.uint8):19,
        torch.tensor([180, 165, 180], dtype=torch.uint8):20,
        torch.tensor([190, 153, 153], dtype=torch.uint8):21,
        torch.tensor([220,  20,  60], dtype=torch.uint8):22,
        torch.tensor([220, 220,   0], dtype=torch.uint8):23,
        torch.tensor([230, 150, 140], dtype=torch.uint8):24,
        torch.tensor([244,  35, 232], dtype=torch.uint8):25,
        torch.tensor([250, 170,  30], dtype=torch.uint8):26,
        torch.tensor([250, 170, 160], dtype=torch.uint8):27,
        torch.tensor([255,   0,   0], dtype=torch.uint8):28
        }
def mask_to_class(self, mask):
    for k in self.mapping:
        # print(k.dtype)
        # print(mask.dtype)
        mask[mask == k] = self.mapping[k]
    return mask

def __getitem__(self, idx):
    img_list = os.listdir(img_dir)
    mask_list = os.listdir(mask_dir)

    img_str = img_list[idx]
    img_arr = io.imread(os.path.join(img_dir, img_str))
    img_tensor = torch.from_numpy(img_arr)
    img_tensor = img_tensor.permute(2, 0, 1)

    mask_str = mask_list[idx]
    mask_arr = io.imread(os.path.join(mask_dir, mask_str))
    mask_tensor = torch.from_numpy(mask_arr)
    mask_tensor = mask_tensor.permute(2, 0, 1)
    mask_tensor = mask_tensor.view(mask_tensor.size(0), -1).permute(1, 0)
    print(mask_tensor)
    print(mask_tensor.shape)

    mask_tensor = self.mask_to_class(mask_tensor)
    print(mask_tensor.shape)
    sample = {'image': img_tensor, 'mask': mask_tensor}

After the mapping, the mask still has size 465750 x 3.
Here is an example of the mask values before and after the mapping:

tensor([[ 70, 130, 180],
        [ 70, 130, 180],
        [ 70, 130, 180],
        ...,
        [244,  35, 232],
        [244,  35, 232],
        [244,  35, 232]], dtype=torch.uint8)

tensor([[ 8,  9,  9],
        [ 8,  9,  9],
        [ 8,  9,  9],
        ...,
        [25, 25, 25],
        [25, 25, 25],
        [25, 25, 25]], dtype=torch.uint8)


I don’t understand:

  1. In the mapping, each 1x3 tensor is mapped to an integer, so why doesn’t the size of the mapped mask change?
  2. Is the mapped mask value above correct?
  3. If it is correct, what do the values in the mapped mask represent? For example, in the first [8, 9, 9] row, is 8 one class and 9 another class? I’m confused.

Could you give me some explanations?
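
For what it’s worth, the likely reason the shape stays [465750, 3] is that mask == k compares element-wise (with broadcasting) instead of matching whole RGB rows; a small sketch with made-up values shows the difference:

import torch

mask = torch.tensor([[70, 130, 180], [244, 35, 232]], dtype=torch.uint8)
k = torch.tensor([70, 130, 180], dtype=torch.uint8)

print(mask == k)               # element-wise comparison, shape [2, 3]
print((mask == k).all(dim=1))  # row-wise match, shape [2]: tensor([True, False])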

Do you have the Cityscapes dataset?

Hi, I use the KITTI semantic segmentation dataset.

Thank you, do you have an email?

Yes.
xysong@ntu.edu.sg

Hi, can you explain how the mapping worked for you?

@ptrblck I have the same error, and I’m mapping the values to class labels, which are either 0 or 1.
Could you please help me?

Could you post a code snippet to reproduce this issue?
Of interest would be the input and output, their shapes, min/max values, and the criterion you are using.

I’ve found the issue. Because of my problem discussed in another post, I’m trying different networks implemented by others to see where my problem is.
@ptrblck
This is the link to my project: https://github.com/noornk/U-Net
My images are mammographies, and I’ve produced PNGs for the masks from XML files.
I have a problem visualizing the test results, and because of this, I’m not sure whether my U-Net is working well or not.

Hi, I got a similar error. I am doing image segmentation and there are 4 classes for the pixels. Originally the shape of my mask is [-1, 32, 32, 4]; I then map the mask to [-1, 32, 32]. But I got another error:

RuntimeError: input and target batch or spatial sizes don't match: target [128 x 32 x 32], input [128 x 32 x 32 x 4] at /pytorch/aten/src/THCUNN/generic/SpatialClassNLLCriterion.cu:23

Here is my code:

criterion = nn.CrossEntropyLoss().cuda()
optimizer = optim.Adam(model.parameters())
for batch_idx, (data, target) in enumerate(train_loader, 1):
    if args.cuda:
        data, target = data.cuda(), target.cuda()
    optimizer.zero_grad()
    print(type(data))
    output = model(data)
    loss = criterion(output, target.long())
    loss.backward()
    optimizer.step()

The shape of my output is [-1, 32, 32, 4].
How can I solve this error? Help :joy:
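
For reference, nn.CrossEntropyLoss expects the model output as [batch, classes, height, width] and the target as [batch, height, width]. If the model really returns a channels-last tensor as described, a sketch of the needed permute (assuming that layout) would be:

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# stand-ins for the shapes described above: output [N, H, W, C] with C = 4 classes
output = torch.randn(128, 32, 32, 4)
target = torch.randint(0, 4, (128, 32, 32))

output = output.permute(0, 3, 1, 2)  # -> [N, C, H, W], the layout the criterion expects
loss = criterion(output, target)
print(loss.item())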