Why am I receiving the following error from Conv2d in the PyTorch library?

I have the following issue with Conv2d in PyTorch. I am trying to feed in a tensor of shape `in_shape = (64, 3, 200, 220)`, and I am getting an error telling me that it expects this shape: `[2, 64, 3, 200, 220]`. Why would it want to see an extra dimension of 2? Any ideas?

I would say that PyTorch is not saying that. :slight_smile:

Could you paste the exact error and the layer you think it comes from?
PyTorch 2D convolutions work with the format you are describing: (B, C, H, W).
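
For reference, a minimal sketch showing that `nn.Conv2d` happily accepts a 4D (B, C, H, W) tensor with the exact shape from the question (the layer parameters here are just illustrative):

```python
import torch
from torch import nn

# A plain 2D convolution: 3 input channels, 64 output channels, 4x4 kernel
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=(4, 4))

# (B, C, H, W) = (64, 3, 200, 220), the shape from the question
x = torch.randn(64, 3, 200, 220)
y = conv(x)

# With no padding and stride 1: H -> 200 - 4 + 1 = 197, W -> 220 - 4 + 1 = 217
print(y.shape)  # torch.Size([64, 64, 197, 217])
```

No fifth dimension is required, so the `[2, ...]` in the reported error most likely came from somewhere else in the pipeline.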

I have solved it: I was providing more dimensions than were needed, and for some reason conv2d placed a 2 in front of the tensor's shape tuple.

But right now I am having an issue with connecting a CNN and a GRU for a speaker identification task. Any idea how to do that?

You usually use max pooling or average pooling to obtain a feature vector. Then either the dimensions already match, or you can use a linear layer to go from the N features of the CNN to the M features of the GRU.
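
A hedged sketch of that bridging step (all dimension numbers below are made up for illustration):

```python
import torch
from torch import nn

# Pretend CNN output: (batch, channels, height, width)
B, C, H, W = 8, 128, 5, 12
feats = torch.randn(B, C, H, W)

# Average-pool over the height axis -> (B, C, W),
# so the width axis can act as the sequence (time) axis
pooled = feats.mean(dim=2)

# Rearrange to (B, W, C): (batch, time, features)
pooled = pooled.permute(0, 2, 1)

# Linear layer maps the N=128 CNN features to M=64 GRU input features
proj = nn.Linear(C, 64)
seq = proj(pooled)  # (B, W, 64), ready for a GRU with batch_first=True
print(seq.shape)
```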

Yes, I am trying to do that, but I have an error: the RNN wants a 3D tensor but the CNN gives a 4D one. :frowning:

So you have something like (B, C, H, W) out of the CNN,
then you pool and get a 3D tensor,
then you need to permute (depending on whether you use the batch_first flag or not),
plus a linear layer if necessary.
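
Those steps can be sketched end to end like this (the sizes are assumptions, not taken from the thread):

```python
import torch
from torch import nn

# CNN output after pooling the height away: (B, C, 1, W)
x = torch.randn(8, 64, 1, 50)

x = x.squeeze(2)        # drop the singleton dim -> (B, C, W)
x = x.permute(0, 2, 1)  # -> (B, W, C) for a GRU with batch_first=True
# x.permute(2, 0, 1) would instead give (W, B, C) for batch_first=False

gru = nn.GRU(input_size=64, hidden_size=32, batch_first=True)
out, h = gru(x)
print(out.shape)  # (B, W, hidden) = (8, 50, 32)
```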

I’m doing something like this:

from torch import nn, optim

# Create a sequential model
model = nn.Sequential()

# Add convolutional and pooling layers
model.add_module('Conv_1', nn.Conv2d(in_channels=3, out_channels=64, kernel_size=(4, 4)))
model.add_module('LRelu_1', nn.LeakyReLU())
model.add_module('BatchN_1', nn.BatchNorm2d(64))
model.add_module('AvgPool_1', nn.AvgPool2d(kernel_size=4))
model.add_module('BatchN_2', nn.BatchNorm2d(64))
model.add_module('Conv_2', nn.Conv2d(in_channels=64, out_channels=128, kernel_size=(4, 4)))
model.add_module('LRelu_2', nn.LeakyReLU())
model.add_module('BatchN_3', nn.BatchNorm2d(128))
model.add_module('AvgPool_2', nn.AvgPool2d(kernel_size=4))
model.add_module('BatchN_4', nn.BatchNorm2d(128))
model.add_module('Conv_3', nn.Conv2d(in_channels=128, out_channels=128, kernel_size=(3, 3)))
model.add_module('LRelu_3', nn.LeakyReLU())
# Renamed from the duplicate 'BatchN_3', which would silently replace the earlier module
model.add_module('BatchN_5', nn.BatchNorm2d(128))
model.add_module('AvgPool_3', nn.AvgPool2d(kernel_size=2))
model.add_module('BatchN_6', nn.BatchNorm2d(128))
model.add_module('Conv_4', nn.Conv2d(in_channels=128, out_channels=64, kernel_size=(2, 2)))
model.add_module('LRelu_4', nn.LeakyReLU())

# Add a Flatten layer to the model
#model.add_module('Flatten', nn.Flatten())

# Add a Linear layer with 256 units and LeakyReLU activation
model.add_module('Linear_1', nn.Linear(in_features=4, out_features=256, bias=True))
model.add_module('LRelu_5', nn.LeakyReLU())

# Add the last Linear layer
model.add_module('Linear_2', nn.Linear(in_features=256, out_features=183, bias=True))
model.add_module('Out_activation', nn.Softmax(-1))  # ??? -1

model = model.to(device)

from torchsummary import summary

in_shape = (3, 200, 220)
summary(model, input_size=in_shape)

If you could help me, that would be a fantastic contribution to the PyTorch forum, because there is nothing about connecting a CNN and a GRU.

Now, with this code, the error is: RuntimeError: input must have 3 dimensions, got 4
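
In case it helps: that RuntimeError is what a GRU/RNN raises when it receives the CNN's 4D (B, C, H, W) output directly. A hedged sketch of one way to collapse it to the 3D shape the recurrent layer requires (the sizes here are assumptions, not the model's actual ones):

```python
import torch
from torch import nn

# Pretend output of the conv stack: (B, C, H, W)
cnn_out = torch.randn(16, 64, 2, 5)
B, C, H, W = cnn_out.shape

# Merge the spatial dims into one sequence axis: (B, C, H*W)
seq = cnn_out.flatten(2)

# Rearrange to (B, H*W, C): (batch, time, features), 3D as the GRU requires
seq = seq.permute(0, 2, 1)

gru = nn.GRU(input_size=C, hidden_size=128, batch_first=True)
out, _ = gru(seq)
print(out.shape)  # (16, 10, 128)
```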