RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 238 and 57 in dimension 2 at /opt/conda/conda-bld/pytorch_1573049304260/work/aten/src/THC/generic/

Dear programmers,

I am trying to define a simplified version of the original 3D UNet. When I tried to run the original UNet architecture, I ran out of memory, so I reduced the number of layers and filters. However, when I try to train the model, the following error occurs.

File "/home/gaofei/newResearch/codes2/test1/project_fei-master/code/networks/", line 192, in forward
    up3 = self.conv_128_128_UpConv(block2, block1)
  File "/home/gaofei/anaconda3/lib/python3.6/site-packages/torch/nn/modules/", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/gaofei/newResearch/codes2/test1/project_fei-master/code/networks/", line 140, in forward
    out = torch.cat((crop1, up), 1)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 238 and 57 in dimension 2 at /opt/conda/conda-bld/pytorch_1573049304260/work/aten/src/THC/generic/

I’m using a GPU with 10.92 GiB total capacity, and my UNet code is as follows:

import torch
import torch.nn as nn
import torch.nn.functional as F

class Convolution(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(Convolution, self).__init__()

        self.convolution = nn.Conv3d(in_channels, out_channels, kernel_size=3)
        self.batch = nn.BatchNorm3d(out_channels)

    def forward(self, x):
        out = F.relu(self.batch(self.convolution(x)))

        return out

class UpConvolution(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(UpConvolution, self).__init__()

        self.up_convolution = nn.ConvTranspose3d(in_channels, out_channels,
                                                 kernel_size=2, stride=2)

    #Center crop
    def crop(self, bridge, up):
        batch_size, n_channels, depth, layer_width, layer_height = bridge.size()
        target_batch_size, target_n_channels, target_depth, target_layer_width, target_layer_height = up.size()

        xy = (layer_width - target_layer_width) //2
        zxy = (depth - target_depth) //2
        # Returns a smaller block which is the same size as the block in the up part
        return bridge[:, :, zxy:(zxy + target_depth), xy:(xy + target_layer_width), xy:(xy + target_layer_width)]

    def forward(self, x, bridge):

        up = self.up_convolution(x)
        # Bridge is the opposite block of the up part
        crop1 = self.crop(bridge, up)
        out = torch.cat((crop1, up), 1)

        return out

class UNet(nn.Module):
    def __init__(self):
        super(UNet, self).__init__()

        self.pooling = nn.MaxPool3d(kernel_size=2, stride=1)

        #Down of unet
        self.conv_1_32 = Convolution(1, 8)
        self.conv_32_64 = Convolution(8, 16)
        self.conv_64_64 = Convolution(16, 16)
        self.conv_64_128 = Convolution(16, 32)
        #self.conv_128_128 = Convolution(128, 128)
        #self.conv_128_256 = Convolution(128, 256)
        #self.conv_256_256 = Convolution(256, 256)
        #self.conv_256_512 = Convolution(256, 512)

        #Up of unet
        #self.conv_512_512_UpConv = UpConvolution(512, 512)
        #self.conv_768_256_Conv = Convolution(768, 256)
        #self.conv_256_256_Conv = Convolution(256, 256)
        #self.conv_256_256_UpConv = UpConvolution(256, 256)
        #self.conv_384_128_Conv = Convolution(384, 128)
        #self.conv_128_128_Conv = Convolution(128, 128)
        self.conv_128_128_UpConv = UpConvolution(32, 32)
        self.conv_192_64_Conv = Convolution(48, 16)
        self.conv_64_64_Conv = Convolution(16, 16)
        self.conv_64_1 = nn.Conv3d(16, 1, 1)

    def forward(self, x):
        start = self.conv_1_32(x)
        block1 = self.conv_32_64(start)
        block1_pool = self.pooling(block1)
        block2 = self.conv_64_64(block1_pool)
        block2 = self.conv_64_128(block2)
        #block2_pool = self.pooling(block2)
        #block3 = self.conv_128_128(block2_pool)
        #block3 = self.conv_128_256(block3)
        #block3_pool = self.pooling(block3)
        #block4 = self.conv_256_256(block3_pool)
        #block4 = self.conv_256_512(block4)

        #up1 = self.conv_512_512_UpConv(block4, block3)
        #up1_conv = self.conv_768_256_Conv(up1)
        #up1_conv = self.conv_256_256_Conv(up1_conv)
        #up2 = self.conv_256_256_UpConv(block3, block2)
        #up2_conv = self.conv_384_128_Conv(up2)
        #up2_conv = self.conv_128_128_Conv(up2_conv)
        up3 = self.conv_128_128_UpConv(block2, block1)
        up3_conv = self.conv_192_64_Conv(up3)
        up3_conv = self.conv_64_64_Conv(up3_conv)
        output = self.conv_64_1(up3_conv)

        output = torch.sigmoid(output)

        return output

Any suggestions and remarks would be highly appreciated.

The bug happens in this line

  up3 = self.conv_128_128_UpConv(block2, block1)

The error points to a size mismatch in dim2 for block2 and block1.
I just skimmed through the code, but maybe you wanted to use block1_pool instead of block1?

Generally, all dimensions besides the specified one should match when you are trying to use torch.cat.
In your case, dim1 could differ, but all other dims should have the same size.
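
To illustrate the rule, here is a minimal sketch with small stand-in tensors (the sizes are made up for the example and are not taken from the model above). Concatenating along dim 1 works when only the channel count differs and fails as soon as another dimension differs:

```python
import torch

# Small stand-in tensors laid out as (batch, channels, depth, height, width).
a = torch.randn(1, 16, 8, 8, 8)
b = torch.randn(1, 32, 8, 8, 8)

# Only the channel dimension (dim 1) differs, so concatenation succeeds.
merged = torch.cat((a, b), dim=1)
print(merged.shape)  # torch.Size([1, 48, 8, 8, 8])

# A mismatch in any other dimension (here the depth, dim 2) raises a RuntimeError.
c = torch.randn(1, 32, 6, 8, 8)
try:
    torch.cat((a, c), dim=1)
    cat_failed = False
except RuntimeError as err:
    cat_failed = True
    print("cat failed:", err)
```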

Dear sir, thank you for your reply. I have tried block1_pool and it still did not work. Could you please check it again?

I really appreciate your time and patience.

Could you check the shapes of block2 and block1 and make sure they are equal in all necessary dimensions?
I’ve implemented a simple UNet a while ago here, which could be a good starter code for your model.

The shapes are as follows.
block1.shape = [1, 16, 124, 124, 124]
block2.shape = [1, 32, 119, 119, 119]

How can I modify the code to make them equal in all necessary dimensions?

I have also checked this model as suggested. However, I am having some trouble with it as well.

Please, could we first adjust the shapes of block2 and block1? Then, I may open another topic for the model you recommended.

Thank you very much for your time and patience

You could either check all setups of the convolutions and make sure the depth, height, and width match or force block1 to have the same shape as block2 via F.interpolate:

x = torch.randn([1, 16, 124, 124, 124])
x = F.interpolate(x, size=(119))
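
As a slightly fuller sketch of the same idea, the snippet below resizes a skip tensor to the spatial size of the tensor it will be concatenated with. The names `skip` and `up` and the small sizes are stand-ins chosen for the example. In the posted model the second argument to the concatenation is the output of the transposed convolution, so the target size would presumably have to come from that tensor rather than from block2 itself:

```python
import torch
import torch.nn.functional as F

# Small stand-ins for the skip tensor and the tensor it is concatenated with.
skip = torch.randn(1, 16, 12, 12, 12)
up = torch.randn(1, 32, 9, 9, 9)

# Resize the spatial dims (depth, height, width) of `skip` to those of `up`.
# The default interpolation mode is nearest-neighbour.
skip_resized = F.interpolate(skip, size=tuple(up.shape[2:]))
print(skip_resized.shape)  # torch.Size([1, 16, 9, 9, 9])

# Only the channel dimension differs now, so concatenation works.
out = torch.cat((skip_resized, up), dim=1)
print(out.shape)  # torch.Size([1, 48, 9, 9, 9])
```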

Would it not be better to revise the network layers instead? Is it logical to perform such manipulations within the network definition? Sorry if my questions seem silly. I am just a beginner.

Yes, I think it would be better to check the conv / transposed conv settings in the model.
To do so, you could print out the activation shape after each layer and adjust the necessary parameters.
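
One way to do this without allocating any large tensors is to trace the sizes by hand. The sketch below uses the standard output-size formulas for unpadded, undilated convolution, pooling, and transposed-convolution layers. The helper names `conv_out`, `up_out`, and `trace` are hypothetical, and the `pool_stride=2` variant is only an assumption about what was intended, not a confirmed fix:

```python
def conv_out(n, kernel, stride=1, padding=0):
    """Output length of a convolution or pooling layer along one axis."""
    return (n + 2 * padding - kernel) // stride + 1


def up_out(n, kernel, stride, padding=0):
    """Output length of a transposed convolution along one axis."""
    return (n - 1) * stride - 2 * padding + kernel


def trace(n, pool_stride):
    """Follow one spatial axis through the layers of the posted forward()."""
    start = conv_out(n, kernel=3)                             # conv_1_32
    block1 = conv_out(start, kernel=3)                        # conv_32_64
    pooled = conv_out(block1, kernel=2, stride=pool_stride)   # MaxPool3d
    block2 = conv_out(conv_out(pooled, kernel=3), kernel=3)   # conv_64_64, conv_64_128
    up = up_out(block2, kernel=2, stride=2)                   # ConvTranspose3d
    return {"block1": block1, "pooled": pooled, "block2": block2, "up": up}


posted = trace(128, pool_stride=1)  # pooling exactly as posted
halved = trace(128, pool_stride=2)  # assumption: pooling meant to halve the size
print(posted)  # {'block1': 124, 'pooled': 123, 'block2': 119, 'up': 238}
print(halved)  # {'block1': 124, 'pooled': 62, 'block2': 58, 'up': 116}
```

With the posted stride of 1, the upsampled tensor (238) is larger than block1 (124), so the center crop cannot shrink block1 to match. The crop offset becomes (124 - 238) // 2 = -57, and the resulting slice keeps only 57 elements, which is consistent with "Got 238 and 57" in the error. With a pooling stride of 2, the upsampled size (116) is smaller than block1, so the crop has room to work.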

My image and target have shapes [1, 1, 128, 128, 128] and [1, 128, 128, 128], respectively. I still could not figure out how to fix it. Could you please help debug it with some random tensors?