How to match sizes of Tensors for concatenation

Hello,
I am trying to modify DenseNet for my use case by taking the output of one dense block and concatenating it with the output of another, while retaining the information. See the code below:

import math

import torch
import torch.nn as nn

# Transition, Bottleneck, and SingleLayer are the standard DenseNet building
# blocks and are defined elsewhere in my code.

class DenseNet(nn.Module):
    def __init__(self, growthRate, depth, reduction, bottleneck):
        super(DenseNet, self).__init__()

        nDenseBlocks = (depth - 4) // 3
        if bottleneck:
            nDenseBlocks //= 2

        nChannels = 2 * growthRate
        self.conv1 = nn.Conv2d(3, nChannels, kernel_size=3, padding=1,
                               bias=False)
        self.dense1 = self._make_dense(nChannels, growthRate, nDenseBlocks, bottleneck)
        nChannels += nDenseBlocks * growthRate
        nOutChannels = int(math.floor(nChannels * reduction))
        self.trans1 = Transition(nChannels, nOutChannels)

        nChannels = nOutChannels
        self.dense2 = self._make_dense(nChannels, growthRate, nDenseBlocks, bottleneck)
        nChannels += nDenseBlocks * growthRate
        nOutChannels = int(math.floor(nChannels * reduction))
        # 516 = 300 + 216 channels of out_d (see the shapes printed in forward)
        self.trans2 = Transition(516, nOutChannels)

        nChannels = nOutChannels
        self.dense3 = self._make_dense(nChannels, growthRate, nDenseBlocks, bottleneck)
        nChannels += nDenseBlocks * growthRate
        self.trans3 = Transition(nChannels, nOutChannels)

    def _make_dense(self, nChannels, growthRate, nDenseBlocks, bottleneck):
        layers = []
        for i in range(int(nDenseBlocks)):
            if bottleneck:
                layers.append(Bottleneck(nChannels, growthRate))
            else:
                layers.append(SingleLayer(nChannels, growthRate))
            nChannels += growthRate
        return nn.Sequential(*layers)

    def forward(self, x):
        x_shape = x.shape[2:]
        out = self.conv1(x)
        out_a = self.dense1(out)  # A
#         out_a = self.dropout(out_a)
        print(f'Shape of out_a is {out_a.shape}')

        # trans1 halves the spatial size, so out_b and everything after it
        # is 112 x 112 while out_a is still 224 x 224
        out_b = self.trans1(out_a.clone())
        print(f'Shape of out_b is {out_b.shape}')

        out_c = self.dense2(out_b)
        print(f'Shape of out_c is {out_c.shape}')
        out_d = torch.cat([out_c.clone(), out_a], 1)  # fails: spatial sizes differ

        out_e = self.trans2(out_d.clone())
        out_f = self.dense3(out_e)
        out_g = torch.cat([out_f.clone(), out_d], 1)
        return out_g

However, when I try to print the model summary, feeding it a 3-channel image of size 224 x 224, i.e. an input of shape (3, 224, 224):

from torchsummary import summary

summary(model, (3, 224, 224), 16)

I get the following error:

Shape of out_a is torch.Size([2, 216, 224, 224])
Shape of out_b is torch.Size([2, 108, 112, 112])
Shape of out_c is torch.Size([2, 300, 112, 112])
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-9-d3385e62fb19> in <module>
      1 from torchsummary import summary
      2 
----> 3 summary(model, (3, 224, 224), 16)

~/opt/anaconda3/envs/ge-mac/lib/python3.6/site-packages/torchsummary/torchsummary.py in summary(model, input_size, batch_size, device)
     70     # make a forward pass
     71     # print(x.shape)
---> 72     model(*x)
     73 
     74     # remove these hooks

~/opt/anaconda3/envs/ge-mac/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

<ipython-input-5-aa8ff12b625d> in forward(self, x)
     88         out_c = self.dense2(out_b)
     89         print(f'Shape of out_c is {out_c.shape}')
---> 90         out_d = torch.cat([out_c.clone(), out_a], 1)
     91 
     92 #         out_c_ =  F.interpolate(out_c.clone(), (224,224), mode='bilinear', align_corners=True)

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 112 and 224 in dimension 2 at /Users/distiller/project/conda/conda-bld/pytorch_1579022036889/work/aten/src/TH/generic/THTensor.cpp:612

After printing the output shapes, I can see the size mismatch: trans1 halves the spatial size, so out_c (112 x 112) no longer matches out_a (224 x 224).

Shape of out_a is torch.Size([2, 216, 224, 224])
Shape of out_b is torch.Size([2, 108, 112, 112])
Shape of out_c is torch.Size([2, 300, 112, 112])

I thought of upsampling out_c, i.e. using F.interpolate(out_c, desired_shape, mode='bilinear', align_corners=True), but would this retain my information without altering it? If not, is there another way to match the sizes for the subsequent concatenation?
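For concreteness, here is a minimal sketch of what I have in mind, using dummy tensors with the shapes printed above (not my actual activations):

import torch
import torch.nn.functional as F

# dummy activations with the shapes printed above
out_a = torch.randn(2, 216, 224, 224)
out_c = torch.randn(2, 300, 112, 112)

# upsample out_c to out_a's spatial size before concatenating
out_c_up = F.interpolate(out_c, size=out_a.shape[2:],
                         mode='bilinear', align_corners=True)
out_d = torch.cat([out_c_up, out_a], 1)
print(out_d.shape)  # torch.Size([2, 516, 224, 224])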
Many thanks for the help

Your suggested interpolation uses bilinear interpolation, so it would change the activation tensor. It depends on your use case whether this would be acceptable or whether e.g. a nearest-neighbor interpolation would be preferable.
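To illustrate the difference, here is a minimal sketch with a small dummy tensor (not tied to your model):

import torch
import torch.nn.functional as F

x = torch.tensor([[[[1., 2.],
                    [3., 4.]]]])  # a dummy 1x1x2x2 activation map

print(F.interpolate(x, scale_factor=2, mode='nearest'))
# Every output value is a copy of its nearest input value:
# tensor([[[[1., 1., 2., 2.],
#           [1., 1., 2., 2.],
#           [3., 3., 4., 4.],
#           [3., 3., 4., 4.]]]])

print(F.interpolate(x, scale_factor=2, mode='bilinear', align_corners=True))
# The output now contains new values blended from neighboring inputs.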

Thank you for your response.

I do not want a change in the activation tensor; I want to concatenate it as is, so the interpolation should not change the features. Does this mean that nearest-neighbor interpolation allows upsampling without a change in the activation tensors? I can't seem to find enough documentation on bilinear and nearest interpolation.

These lecture notes might help. In summary: nearest-neighbor interpolation reuses the value of the nearest input pixel, while (bi)linear interpolation creates new values from the neighboring pixels, using their distances as the weights.
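A minimal 1D sketch (with made-up values) makes the weighting concrete:

import torch
import torch.nn.functional as F

x = torch.tensor([[[1., 3.]]])  # shape (1, 1, 2)

print(F.interpolate(x, size=4, mode='nearest'))
# tensor([[[1., 1., 3., 3.]]]) -- each output copies its nearest input value

print(F.interpolate(x, size=4, mode='linear', align_corners=True))
# tensor([[[1.0000, 1.6667, 2.3333, 3.0000]]])
# e.g. 1.6667 = (2/3) * 1 + (1/3) * 3: the two neighbors weighted by distance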

Many thanks ptrblck!
This explains it well, and the shared resource is useful.