RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 170 and 171 in dimension 3 at /pytorch/aten/src/THC/generic/THCTensorMath.cu:111

I am receiving this error

File "/erfnet_pytorch-master/train/erfnet.py", line 20, in forward
 output = torch.cat([self.conv(input), self.pool(input)], 1)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 170 and 171 in dimension 3 at /pytorch/aten/src/THC/generic/THCTensorMath.cu:111

And this is the part of the code that raises the error:

class DownsamplerBlock (nn.Module):
    def __init__(self, ninput, noutput):
        super(DownsamplerBlock, self).__init__()

        self.conv = nn.Conv2d(ninput, noutput-ninput, (3, 3), stride=2, padding=1, bias=True)
        self.pool = nn.MaxPool2d(2, stride=2)
        self.bn = nn.BatchNorm2d(noutput, eps=1e-3)

    def forward(self, input):
        print([self.conv(input).size(), self.pool(input).size()])
        output = torch.cat([self.conv(input), self.pool(input)], 1)
        output = self.bn(output)
        return F.relu(output)

The output of the print line is:

[(1, 13, 256, 341), (1, 3, 256, 341)]
[(1, 48, 128, 171), (1, 16, 128, 170)]

I can see that there is a mismatch in the last dimension, 170 vs. 171 … but I don’t know why.

Could you set ceil_mode=True for your nn.MaxPool2d layer and try it again?


Thank you … that solved the error.
But can you explain what this flag/option does? Why did I have a dimension mismatch in the first place, and how does setting it solve the problem?

Another thing: I now get a different error.


Traceback (most recent call last):
  File "main.py", line 506, in <module>
    main(parser.parse_args())
  File "main.py", line 460, in main
    model = train(args, model, True) #Train encoder
  File "main.py", line 232, in train
    loss = criterion(outputs, targets[:, 0])
  File "/home/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "main.py", line 82, in forward
    return self.loss(torch.nn.functional.log_softmax(outputs, dim=1), targets)
  File "/home/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/.local/lib/python2.7/site-packages/torch/nn/modules/loss.py", line 193, in forward
    self.ignore_index, self.reduce)
  File "/home/.local/lib/python2.7/site-packages/torch/nn/functional.py", line 1334, in nll_loss
    return torch._C._nn.nll_loss2d(input, target, weight, size_average, ignore_index, reduce)
RuntimeError: input and target batch or spatial sizes don't match: target [5 x 64 x 85], input [5 x 13 x 64 x 86] at /pytorch/aten/src/THCUNN/generic/SpatialClassNLLCriterion.cu:24

The reason is that your conv layer resamples the input differently than your pooling layer.
While both should halve the input in the spatial dimensions, the pooling layer uses the floor operation by default, resulting in floor(341/2) = 170 for the width. Setting ceil_mode=True will instead return 171, which matches the conv output.
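
Here is a minimal sketch of the behaviour (the spatial size matches your second block; the channel counts and the random input are just placeholders for this example):

import torch
import torch.nn as nn

x = torch.randn(1, 3, 256, 341)  # toy input

conv = nn.Conv2d(3, 13, 3, stride=2, padding=1)        # width: (341 + 2*1 - 3)//2 + 1 = 171
pool_floor = nn.MaxPool2d(2, stride=2)                  # width: floor((341 - 2)/2) + 1 = 170
pool_ceil = nn.MaxPool2d(2, stride=2, ceil_mode=True)   # width: ceil((341 - 2)/2) + 1 = 171

print(conv(x).shape)        # torch.Size([1, 13, 128, 171])
print(pool_floor(x).shape)  # torch.Size([1, 3, 128, 170])
print(pool_ceil(x).shape)   # torch.Size([1, 3, 128, 171])

With ceil_mode=True both branches produce a width of 171, so the torch.cat along dim=1 works again.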

Your new error seems to be related to the same issue: the target is one column smaller than the output (85 vs. 86 in the width).
Usually it’s simpler to use power-of-two sizes, but if that’s not possible in your use case, you would have to check which operation creates the size mismatch.
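
For reference, a rough sketch of the shapes the 2D NLL loss expects (the sizes are taken from your traceback; the random tensors are only placeholders):

import torch
import torch.nn.functional as F

output = torch.randn(5, 13, 64, 86)                          # [N, C, H, W] model output
target = torch.randint(0, 13, (5, 64, 86), dtype=torch.long)  # [N, H, W] class indices with the same H and W
loss = F.nll_loss(F.log_softmax(output, dim=1), target)       # works

# A target of shape [5, 64, 85] differs in the width (85 vs. 86) and raises the
# "input and target batch or spatial sizes don't match" error.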

The targets are simply the ground-truth labels, and the input is the output of the model.

This is how the Encoder works:

class Encoder(nn.Module):
    def __init__(self, num_classes):
        super(Encoder, self).__init__()
        self.initial_block = DownsamplerBlock(3,16)

        self.layers = nn.ModuleList()

        self.layers.append(DownsamplerBlock(16,64))

        for x in range(0, 5):    #5 times
           self.layers.append(non_bottleneck_1d(64, 0.03, 1)) 

        self.layers.append(DownsamplerBlock(64,128))

        for x in range(0, 2):    #2 times
            self.layers.append(non_bottleneck_1d(128, 0.3, 2))
            self.layers.append(non_bottleneck_1d(128, 0.3, 4))
            self.layers.append(non_bottleneck_1d(128, 0.3, 8))
            self.layers.append(non_bottleneck_1d(128, 0.3, 16))

        #Only in encoder mode:
        self.output_conv = nn.Conv2d(128, num_classes, 1, stride=1, padding=0, bias=True)

    def forward(self, input, predict=False):
        output = self.initial_block(input)
        print(output.size())
        for layer in self.layers:
            output = layer(output)
            print(output.size())
        exit()
        if predict:
            output = self.output_conv(output)

        return output

These are the outputs after each layer:

(1, 16, 256, 341)
(1, 64, 128, 171)
(1, 64, 128, 171)
(1, 64, 128, 171)
(1, 64, 128, 171)
(1, 64, 128, 171)
(1, 64, 128, 171)
(1, 128, 64, 86)
(1, 128, 64, 86)
(1, 128, 64, 86)
(1, 128, 64, 86)
(1, 128, 64, 86)
(1, 128, 64, 86)
(1, 128, 64, 86)
(1, 128, 64, 86)
(1, 128, 64, 86)
(1, 13, 64, 86)

And the whole downsampler:

class DownsamplerBlock (nn.Module):
    def __init__(self, ninput, noutput):
        super(DownsamplerBlock, self).__init__()

        self.conv = nn.Conv2d(ninput, noutput-ninput, (3, 3), stride=2, padding=1, bias=True)
        self.pool = nn.MaxPool2d(2, stride=2, ceil_mode=True)
        self.bn = nn.BatchNorm2d(noutput, eps=1e-3)

    def forward(self, input):
        #transforms.RandomCrop(224)
        output = torch.cat([self.conv(input), self.pool(input)], 1)
        output = self.bn(output)
        return F.relu(output)
    

class non_bottleneck_1d (nn.Module):
    def __init__(self, chann, dropprob, dilated):        
        super(non_bottleneck_1d, self).__init__()

        self.conv3x1_1 = nn.Conv2d(chann, chann, (3, 1), stride=1, padding=(1,0), bias=True)

        self.conv1x3_1 = nn.Conv2d(chann, chann, (1,3), stride=1, padding=(0,1), bias=True)

        self.bn1 = nn.BatchNorm2d(chann, eps=1e-03)

        self.conv3x1_2 = nn.Conv2d(chann, chann, (3, 1), stride=1, padding=(1*dilated,0), bias=True, dilation = (dilated,1))

        self.conv1x3_2 = nn.Conv2d(chann, chann, (1,3), stride=1, padding=(0,1*dilated), bias=True, dilation = (1, dilated))

        self.bn2 = nn.BatchNorm2d(chann, eps=1e-03)

        self.dropout = nn.Dropout2d(dropprob)
        

    def forward(self, input):

        output = self.conv3x1_1(input)
        output = F.relu(output)
        output = self.conv1x3_1(output)
        output = self.bn1(output)
        output = F.relu(output)

        output = self.conv3x1_2(output)
        output = F.relu(output)
        output = self.conv1x3_2(output)
        output = self.bn2(output)

        if (self.dropout.p != 0):
            output = self.dropout(output)
        
        return F.relu(output+input)    #+input = identity (residual connection)

I receive the error upon executing this line:
loss = criterion(outputs, targets[:, 0])
where the original targets size is (1, 1, 64, 85).

How did you resize the target?
Was it originally that size? If so, did you upsample your input image?

target = Resize(int(self.height/8), Image.NEAREST)(target)

class MyCoTransform(object):
    def __init__(self, enc, augment=True, height=512):
        self.enc=enc
        self.augment = augment
        self.height = height
        pass
    def __call__(self, input, target):
        # do something to both images
        input =  Resize(self.height, Image.BILINEAR)(input)
        target = Resize(self.height, Image.NEAREST)(target)

        if(self.augment):
            # Random hflip
            hflip = random.random()
            if (hflip < 0.5):
                input = input.transpose(Image.FLIP_LEFT_RIGHT)
                target = target.transpose(Image.FLIP_LEFT_RIGHT)
            
            #Random translation 0-2 pixels (fill rest with padding)
            transX = random.randint(-2, 2) 
            transY = random.randint(-2, 2)

            input = ImageOps.expand(input, border=(transX,transY,0,0), fill=0)
            target = ImageOps.expand(target, border=(transX,transY,0,0), fill=255) #pad label filling with 255
            input = input.crop((0, 0, input.size[0]-transX, input.size[1]-transY))
            target = target.crop((0, 0, target.size[0]-transX, target.size[1]-transY))   

        input = ToTensor()(input)
        if (self.enc):
            target = Resize(int(self.height/8), Image.NEAREST)(target)
        target = ToLabel()(target)
        target = Relabel(255, 19)(target)

        return input, target

Could you resize it to the desired shape?

target = Resize((64, 86), Image.NEAREST)(target)

I received this error


Traceback (most recent call last):
  File "main.py", line 512, in <module>
    main(parser.parse_args())
  File "main.py", line 466, in main
    model = train(args, model, True) #Train encoder
  File "main.py", line 215, in train
    for step, (images, labels) in enumerate(loader):
  File "/home/.local/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 286, in __next__
    return self._process_next_batch(batch)
  File "/home/.local/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 307, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
ValueError: Traceback (most recent call last):
  File "/home/.local/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 57, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/erfnet_pytorch-master/train/dataset.py", line 95, in __getitem__
    image, label = self.co_transform(image, label)
  File "main.py", line 68, in __call__
    target = Resize(64, 86)(target)
  File "build/bdist.linux-x86_64/egg/torchvision/transforms/transforms.py", line 175, in __call__
    return F.resize(img, self.size, self.interpolation)
  File "build/bdist.linux-x86_64/egg/torchvision/transforms/functional.py", line 204, in resize
    return img.resize((ow, oh), interpolation)
  File "/app/anaconda2/envs/tf-1.2/lib/python2.7/site-packages/PIL/Image.py", line 1695, in resize
    raise ValueError("unknown resampling filter")
ValueError: unknown resampling filter

Did you pass the size as a tuple, i.e. (64, 86)?
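
Judging by the traceback (target = Resize(64, 86)(target)), the two numbers were passed as separate positional arguments, so 86 is interpreted as the interpolation argument and PIL rejects it as an unknown resampling filter. A small sketch of the intended call (assuming target is a PIL image):

from PIL import Image
from torchvision.transforms import Resize

# Resize(64, 86)  -> 86 becomes the interpolation -> ValueError: unknown resampling filter
resize = Resize((64, 86), Image.NEAREST)  # size as a (height, width) tuple, NEAREST interpolation
# target = resize(target)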

Ah okay, I passed it as a tuple now and that worked,
but I got another error :)


main.py:292: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  inputs = Variable(images, volatile=True)    #volatile flag makes it free backward or outputs for eval
main.py:293: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  targets = Variable(labels, volatile=True)
main.py:297: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  epoch_loss_val.append(loss.data[0])
Exception NameError: "global name 'FileNotFoundError' is not defined" in <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f877c13ef10>> ignored
Traceback (most recent call last):
  File "main.py", line 507, in <module>
    main(parser.parse_args())
  File "main.py", line 461, in main
    model = train(args, model, True) #Train encoder
  File "main.py", line 304, in train
    iouEvalVal.addBatch(outputs.max(1)[1].unsqueeze(1).data, targets.data)
  File "/erfnet_pytorch-master/train/iouEval.py", line 61, in addBatch
    tp = torch.sum(torch.sum(torch.sum(tpmult, dim=0, keepdim=True), dim=2, keepdim=True), dim=3, keepdim=True).squeeze()
RuntimeError: dimension out of range (expected to be in range of [-1, 0], but got 2)

However, in this epoch, these are the failing lines:

tpmult = x_onehot * y_onehot    #times prediction and gt coincide is 1
tp = torch.sum(torch.sum(torch.sum(tpmult, dim=0, keepdim=True), dim=2, keepdim=True), dim=3, keepdim=True).squeeze()

and x_onehot is tensor([ 0.], device='cuda:0'), and y_onehot is tensor([ 1.], device='cuda:0').

Both tensors are 1-dim tensors, and your “second” torch.sum tries to sum over dim=2.
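
For illustration, a small sketch of why dim=2 is out of range on a 1-dim tensor:

import torch

t = torch.tensor([0.])       # 1-dim tensor, valid dims are -1 and 0
print(torch.sum(t, dim=0))   # works
# torch.sum(t, dim=2)        # RuntimeError: dimension out of range
                             # (expected to be in range of [-1, 0], but got 2)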

I changed it to this:
tp = torch.sum(torch.sum(torch.sum(tpmult, dim=0, keepdim=True))).squeeze()

and it no longer gives errors, but I am not sure whether this is the right thing to do. It is a cloned project that I am working on … so I'm not sure how this will affect the training or the process.

Hello, in this code I don't understand these lines:
target = ToLabel()(target)
target = Relabel(255, 19)(target)
I want to know what these functions do:
ToLabel()
Relabel()

I don’t know how ToLabel is implemented, but you could ask @mhusseinsh if this code could be shared so that you can profile it.
