# Concatenate two tensors with different sizes

Dear senior programmers,

I obtained the following network structure by modifying someone else's network. I added the `dilation` keyword to obtain dilated convolutional layers. However, since there are several concatenations in the forward pass, I have not been able to adjust the output channels and the concatenations properly. Could anyone explain how to fix this? The network is as follows.

```python
import torch
from torch import nn
from torch.nn import functional as F

class net(nn.Module):
    def __init__(self):
        super(net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=3, kernel_size=1)
        self.conv2 = nn.Conv2d(in_channels=3, out_channels=3, kernel_size=3, padding=2, dilation=2)
        self.conv3 = nn.Conv2d(in_channels=3, out_channels=3, kernel_size=5, padding=2, dilation=2)
        self.conv4 = nn.Conv2d(in_channels=6, out_channels=3, kernel_size=7, padding=3, dilation=2)
        self.conv5 = nn.Conv2d(in_channels=12, out_channels=3, kernel_size=3, padding=1)
        self.b = 1

    def forward(self, x):
        x1 = F.relu(self.conv1(x))
        #print(x1.shape)
        x2 = F.relu(self.conv2(x1))
        print(x2.shape)
        cat1 = torch.cat((x1, x2), 2)
        x3 = F.relu(self.conv3(cat1))
        print(x3.shape)
        cat2 = torch.cat((x2, x3), 2)
        x4 = F.relu(self.conv4(cat2))
        cat3 = torch.cat((x1, x2, x3, x4), 2)
        k = F.relu(self.conv5(cat3))

        if k.size() != x.size():
            raise Exception("k, haze image are different size!")

        output = k * x - k + self.b
        return F.relu(output)
```

The running error is as follows.

```
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 2. Got 476 and 480 in dimension 3 at /opt/conda/conda-bld/pytorch_1573049304260/work/aten/src/THC/generic/THCTensorMath.cu:71
```

How can I fix this error? I would also like to understand the relationship between input channels, output channels, padding, and dilation.

Thank you for your time and patience.

Hi @Patrice,
**Rule 1:** to concatenate tensors, they must match in every dimension except the one you concatenate along (e.g. NxDiff1xHxW with NxDiff2xHxW, or NxCxDiff1xW with NxCxDiff2xW, etc.).
In your case, suppose the input x is 1x3x28x28.

what you are doing is:

• you apply conv1; the output x1 is 1x3x28x28
• you apply conv2; the output x2 is 1x3x28x28
• (a) you concatenate x1 and x2 along axis 2 into cat1; the output is 1x56x28x3 (incorrect)
• you apply conv3; the output x3 is 1x3x52x24
• (b) you concatenate x2 and x3 along axis 2 again into cat2, with shapes 1x56x28x3 (x2) and 1x3x52x24 (x3). These two violate Rule 1.

Hence, you need to fix (a) and (b) so that they conform to Rule 1.
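Rule 1 can be sketched as a small helper over plain shape tuples (`can_cat` is a hypothetical name for illustration, not a PyTorch function):

```python
def can_cat(shape_a, shape_b, dim):
    """Rule 1: tensors are concatenable along `dim` iff every other
    dimension matches. Shapes are plain tuples, e.g. (1, 3, 28, 28)."""
    if len(shape_a) != len(shape_b):
        return False
    return all(a == b for i, (a, b) in enumerate(zip(shape_a, shape_b)) if i != dim)

# Channel concat (dim 1): only the channel counts differ -> OK
print(can_cat((1, 3, 28, 28), (1, 6, 28, 28), 1))  # True

# The failing case from the traceback: widths 476 vs 480 differ
# while concatenating along dim 2 -> Rule 1 violated
print(can_cat((1, 3, 480, 476), (1, 3, 480, 480), 2))  # False
```

This is the same check `torch.cat` performs internally before raising the "Sizes of tensors must match except in dimension ..." error.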
Here is a corrected model with dilated convolutions:

```python
import torch
from torch import nn
from torch.nn import functional as F

class net(nn.Module):
    def __init__(self):
        super(net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=3, kernel_size=1)
        self.conv2 = nn.Conv2d(in_channels=3, out_channels=3, kernel_size=3, padding=2, dilation=2)
        self.conv3 = nn.Conv2d(in_channels=6, out_channels=3, kernel_size=5, padding=4, dilation=2)
        self.conv4 = nn.Conv2d(in_channels=6, out_channels=3, kernel_size=7, padding=6, dilation=2)
        self.conv5 = nn.Conv2d(in_channels=12, out_channels=3, kernel_size=3, padding=1)
        self.b = 1

    def forward(self, x, debug=False):
        x1 = F.relu(self.conv1(x))
        if debug: print('x1:', x1.shape)
        x2 = F.relu(self.conv2(x1))
        if debug: print('x2:', x2.shape)
        cat1 = torch.cat((x1, x2), 1)
        if debug: print('cat1:', cat1.shape)
        x3 = F.relu(self.conv3(cat1))
        if debug: print('x3:', x3.shape)
        cat2 = torch.cat((x2, x3), 1)
        if debug: print('cat2:', cat2.shape)
        x4 = F.relu(self.conv4(cat2))
        if debug: print('x4:', x4.shape)
        cat3 = torch.cat((x1, x2, x3, x4), 1)
        if debug: print('cat3:', cat3.shape)
        k = F.relu(self.conv5(cat3))
        if debug: print('k:', k.shape)

        if k.size() != x.size():
            raise Exception("k, haze image are different size!")

        output = k * x - k + self.b
        if debug: print('output:', output.shape)
        return F.relu(output)

net_instance = net()
b = torch.rand(1, 3, 28, 28)
b = net_instance(b, True)

# OUTPUT:
# x1: torch.Size([1, 3, 28, 28])
# x2: torch.Size([1, 3, 28, 28])
# cat1: torch.Size([1, 6, 28, 28])
# x3: torch.Size([1, 3, 28, 28])
# cat2: torch.Size([1, 6, 28, 28])
# x4: torch.Size([1, 3, 28, 28])
# cat3: torch.Size([1, 12, 28, 28])
# k: torch.Size([1, 3, 28, 28])
# output: torch.Size([1, 3, 28, 28])
```

Hope it helps, cheers~

Isn’t the cat of `x1` and `x2` at axis 2 `1x3x56x28`, unless there’s some kind of transpose happening that I don’t see?

By the way, welcome to the community.

Dear Mr. Brilian, I am very grateful for your prompt reply. The code now runs. I would like to get a better picture of convolution, padding, and dilation.

After conv1, the output x1 is 1x3x28x28; I understand this. But applying conv2 to x1, how did you get 1x3x28x28? Shouldn't it be 1x3x26x26, given that the kernel size is 3?

I am sorry if my question seems silly. I have read about convolution, padding, and dilation, but it seems I still haven't understood them clearly.

Yes sir, you are right: the cat of `x1` and `x2` at axis 2 should be `1x3x56x28`. Could you clarify how `x2`, the output of conv2(`x1`), is obtained?

Please refer to relationship 15 on page 28 of this excellent guide to convolution arithmetic.
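For stride 1 that relationship boils down to a one-line formula; here is a small sketch (the function name `conv_out` is mine, not from any library):

```python
def conv_out(i, k, p=0, d=1, s=1):
    """Output size of one spatial dimension of a convolution:
    floor((i + 2p - d*(k-1) - 1) / s) + 1
    where i = input size, k = kernel size, p = padding,
    d = dilation, s = stride."""
    return (i + 2 * p - d * (k - 1) - 1) // s + 1

# conv2 from the model above: kernel 3, padding 2, dilation 2 keeps 28 -> 28
print(conv_out(28, k=3, p=2, d=2))  # 28

# the same dilated kernel without padding shrinks the map: 28 -> 24
print(conv_out(28, k=3, p=0, d=2))  # 24
```

The term `d*(k-1) + 1` is the "effective" kernel size, which is why a dilated kernel needs more padding to preserve the spatial size.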

Hi @Patrice, here is some explanation of conv2(x1):

Notes:

Padding: adds zero values around the edges of your image; e.g. padding=1 in PyTorch changes x1 from 1x3x28x28 to 1x3x30x30 (one pixel added on each side).

Dilation: expands the kernel by inserting gaps between its elements; e.g. kernel_size=3 (a 3x3 kernel) with dilation=2 behaves like a 5x5 kernel (effective size = dilation × (kernel − 1) + 1).

• a 3x3 kernel on a 1x3x28x28 input gives 1x3x26x26 if padding=0 and dilation=1
• a 3x3 kernel on a 1x3x28x28 input gives 1x3x24x24 if padding=0 and dilation=2
• a 3x3 kernel on a 1x3x28x28 input gives 1x3x26x26 if padding=1 and dilation=2
• a 3x3 kernel on a 1x3x28x28 input gives 1x3x28x28 if padding=2 and dilation=2
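Assuming PyTorch is installed, the four cases above can be checked directly with `nn.Conv2d`:

```python
import torch
from torch import nn

x = torch.rand(1, 3, 28, 28)

cases = [
    # (padding, dilation, expected spatial size)
    (0, 1, 26),
    (0, 2, 24),
    (1, 2, 26),
    (2, 2, 28),
]
for p, d, expected in cases:
    conv = nn.Conv2d(3, 3, kernel_size=3, padding=p, dilation=d)
    out = conv(x)
    print('padding', p, 'dilation', d, '->', tuple(out.shape))
    assert out.shape == (1, 3, expected, expected)
```

Each assertion confirms that padding grows the output while dilation shrinks it, exactly as in the bullet list.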

Thanks for the welcome @harsha_g
Hope it helps, cheers ~

Thank you very much Mr. Brilian. It is much clearer now.

Thank you sir. I have gone through it and it has been very helpful.

Good point, it is a typo; thanks for the correction. Cool! And happy coding!