Can we pass images for which height != width through our CNN for training in PyTorch?

spaul13 · April 15, 2020, 1:08am

Can we pass images for which height != width through our CNN in PyTorch?

My CNN has convolution, batch-norm, max-pool, ReLU, and fully connected layers.

My network:

self.conv_seqn = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, padding=1),
    nn.BatchNorm2d(32),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=4, stride=4),
    nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=4, stride=4),
)
self.fc_seqn = nn.Sequential(
    nn.Linear(1843200, 256),
    nn.ReLU(inplace=True),
    nn.Linear(256, total_configs)
)

def forward(self, x):
    x = self.conv_seqn(x)
    x = x.view(x.size(0), -1)  # flatten to [batch_size, features]
    x = self.fc_seqn(x)
    return x

If the input image has size 3840X1920X3, then after applying conv_seqn() the output should have size [1, 128, 120, 60], but I am getting [1, 128, 120, 120] (batch size = 1 here).
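
As a rough check of the expected numbers, the shape arithmetic can be traced by hand. The sketch below assumes PyTorch's default settings (stride 1 and dilation 1 for Conv2d, no padding and floor rounding for MaxPool2d); `conv_out` and `trace_shape` are illustrative helper names, not part of PyTorch.

```python
# Sketch: trace the spatial size (H, W) through conv_seqn by hand.
# Assumes PyTorch defaults: stride=1 and dilation=1 for Conv2d, and
# padding=0 with ceil_mode=False for MaxPool2d.

def conv_out(size, kernel_size, padding=0, stride=1):
    """Output length of one spatial dimension after a conv or pooling layer."""
    return (size + 2 * padding - kernel_size) // stride + 1

def trace_shape(height, width):
    # (kernel_size, padding, stride) for each shape-changing layer in conv_seqn;
    # BatchNorm2d and ReLU leave the shape untouched.
    layers = [
        (3, 1, 1),  # Conv2d(3 -> 32)
        (2, 0, 2),  # MaxPool2d(2)
        (3, 1, 1),  # Conv2d(32 -> 64)
        (4, 0, 4),  # MaxPool2d(4)
        (3, 1, 1),  # Conv2d(64 -> 128)
        (4, 0, 4),  # MaxPool2d(4)
    ]
    for k, p, s in layers:
        height = conv_out(height, k, p, s)
        width = conv_out(width, k, p, s)
    return height, width

h, w = trace_shape(3840, 1920)
flat_features = 128 * h * w  # 128 output channels from the last Conv2d
print(h, w, flat_features)  # 120 60 921600
```

Under these assumptions a 3840x1920 input gives 120x60, so the flattened feature count is 128 * 120 * 60 = 921600. The 1843200 passed to nn.Linear equals 128 * 120 * 120, which is what a square 3840x3840 input would produce.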

ptrblck · April 15, 2020, 3:43am

Yes, you can pass input tensors with a rectangular spatial size to the model.
Could you post the input shape again by wrapping it into three backticks ``` please?
I’m currently unsure how 384021603 should be interpreted.

spaul13 · April 15, 2020, 6:07am

@ptrblck thanks for the comment. I have made the changes to the post. The original size of the image is 3840X1920X3.

Could you please tell me why I can’t pass a rectangular tensor through the convolution, batch-norm, and max-pool layers?

ptrblck · April 15, 2020, 6:13am

The layers you are using expect an input of shape [batch_size, channels, height, width], so you would have to add a batch dimension at dim0 and permute the input so that the channel dimension is at dim1.

Once you have the right shape, you could directly pass the input to the model.
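
A minimal sketch of that conversion, assuming the image is already a tensor laid out as [height, width, channels]; `image_hwc` is a hypothetical placeholder filled with zeros rather than real image data:

```python
import torch

# Hypothetical stand-in for an image stored as [height, width, channels].
image_hwc = torch.zeros(3840, 1920, 3)

# Reorder the axes so channels come first: [H, W, C] -> [C, H, W].
# permute() moves whole axes; view()/reshape() to the same shape would
# not do this and would scramble the pixel layout instead.
image_chw = image_hwc.permute(2, 0, 1)

# Add the batch dimension at dim0: [C, H, W] -> [1, C, H, W].
batch = image_chw.unsqueeze(0)

print(batch.shape)  # torch.Size([1, 3, 3840, 1920])
```

The resulting `batch` has the [batch_size, channels, height, width] layout that Conv2d, BatchNorm2d, and MaxPool2d expect.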

spaul13 · April 15, 2020, 6:18am

Thanks a lot for the reply. The shape of the original input image tensor is [1, 3, 3840, 1920], but after conv_seqn I am still getting [1, 128, 120, 120] instead of [1, 128, 120, 60]. Are there any other reasons why the rectangular tensor would not pass through the convolution block properly?
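
One way to narrow this down is to print the shape of the tensor before the first layer and after every layer, which shows whether the input really is [1, 3, 3840, 1920] at the point where conv_seqn is called. The helper below is only a sketch: `print_layer_shapes` is a hypothetical name, not a PyTorch function, and it is demonstrated on a small stand-in model rather than the full network.

```python
import torch
import torch.nn as nn

def print_layer_shapes(model, x):
    """Run x through an nn.Sequential one layer at a time, printing each shape."""
    print("input", tuple(x.shape))
    shapes = []
    with torch.no_grad():  # shape inspection only, no gradients needed
        for layer in model:
            x = layer(x)
            shapes.append(tuple(x.shape))
            print(type(layer).__name__, tuple(x.shape))
    return shapes

# Small hypothetical model and input, chosen only to keep the demo cheap.
demo = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
)
shapes = print_layer_shapes(demo, torch.randn(1, 3, 64, 32))
```

Calling the same helper with `self.conv_seqn` and the real input would show the first layer at which the height and width stop matching the expected values.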

ptrblck · April 15, 2020, 6:20am

Your conv_seqn outputs the desired shape for me:

import torch
import torch.nn as nn

conv_seqn = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, padding=1),
    nn.BatchNorm2d(32),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=4, stride=4),
    nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=4, stride=4),
)

x = torch.randn(1, 3, 3840, 1920)
out = conv_seqn(x)
print(out.shape)