Hi.
I have a 3-dimensional input tensor of size (1, 128, 100) when the agent selects an action and (batch_size, 128, 100) when the agent trains. The input is a sequence of words that is tokenized; each token gets a vector from a Word2Vec model, and the vectors are concatenated into a tensor. So 128 is the number of tokens and 100 is the W2V vector size. In this convolutional network I get:

RuntimeError: Expected 4-dimensional input for 4-dimensional weight [32, 1, 2, 2], but got 3-dimensional input of size [1, 128, 100] instead

Also, I am confused about some parameter values. Is in_channels=1 correct because of the input type? Please guide me on how to fix this error.
Thanks in advance.

nn.Conv2d layers expect a 4-dimensional input tensor in the shape [batch_size, channels, height, width]. Based on your error and description I guess the channel dimension is missing, so you could add it via x = x.unsqueeze(1) before passing the tensor to the model.
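A minimal sketch of that fix, assuming a first conv layer matching the weight shape [32, 1, 2, 2] from the error message (in_channels=1, out_channels=32, kernel_size=2; the layer itself is an assumption for illustration):

```python
import torch
import torch.nn as nn

# Hypothetical first layer matching the 4D weight [32, 1, 2, 2] in the error
conv = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=2)

x = torch.randn(1, 128, 100)   # [batch, tokens, w2v_dim] -- 3D, raises the error
x = x.unsqueeze(1)             # -> [1, 1, 128, 100], adds the channel dimension
out = conv(x)
print(out.shape)               # torch.Size([1, 32, 127, 99])
```

With in_channels=1 the single channel just wraps the [tokens, w2v_dim] "image", which is why unsqueezing dim 1 is enough.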

@ptrblck, the tensor I want to pass to the CNN has the dimension [32, 3, 512, 512]: it has 32 slices of one image, each of which has three channels. However, the CNN expects a 4-dimensional input tensor [B, C, H, W]. How can I change my tensor [32, 3, 512, 512] so it gets passed as 4D following the expected input order [B, C, H, W]?

No, they do not have to be considered as separate samples; rather, all of them have to be considered as one sample. That's why I want to squish the first two dimensions to make it compliant with the expected input order [B, C, H, W].

Since you've mentioned "slices" I would guess you want to treat this dimension as the "depth" then?
If so, you should use a 3D model and pass the input as [batch_size, channels, depth, height, width] via:

import torch

x = torch.randn(32, 3, 512, 512)
x = x.permute(1, 0, 2, 3).contiguous().unsqueeze(0)
print(x.shape)
# > torch.Size([1, 3, 32, 512, 512])
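For completeness, a sketch of feeding that 5D tensor into a 3D conv layer (the layer sizes here are assumptions for illustration, not a recommended architecture):

```python
import torch
import torch.nn as nn

# Assumed 3D layer: consumes [batch, channels, depth, height, width]
conv3d = nn.Conv3d(in_channels=3, out_channels=8, kernel_size=3, padding=1)

x = torch.randn(32, 3, 512, 512)
x = x.permute(1, 0, 2, 3).contiguous().unsqueeze(0)  # -> [1, 3, 32, 512, 512]
out = conv3d(x)
print(out.shape)  # torch.Size([1, 8, 32, 512, 512])
```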

Yes, I will be using a 3D CNN later, but at the moment I want to run a ResNet as a baseline, and I have 32 slices per sample. [32, 3, 512, 512] is the tensor dimension; I want to squish the first two dimensions so that I can pass it as [B, C, H, W] to the network.

I'm not sure how the description fits the shape, but assuming you want to move the slices into the channel dimension, you could use x = x.view(-1, 512, 512).unsqueeze(0) to get a tensor of shape [1, 96, 512, 512], which would then of course not work anymore in a standard ResNet model, since 3 input channels are expected. If this is your use case, you could replace the first conv layer with a new one accepting 96 channels.
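A minimal sketch of that replacement, using a standalone conv layer shaped like the usual ResNet stem (7x7, stride 2, padding 3) rather than a full model, so the channel change is easy to see; the layer is a stand-in, not the actual torchvision ResNet:

```python
import torch
import torch.nn as nn

x = torch.randn(32, 3, 512, 512)
x = x.view(-1, 512, 512).unsqueeze(0)  # squish slices + channels -> [1, 96, 512, 512]
print(x.shape)

# Replacement stem accepting 96 input channels instead of the usual 3
conv1 = nn.Conv2d(96, 64, kernel_size=7, stride=2, padding=3, bias=False)
out = conv1(x)
print(out.shape)  # torch.Size([1, 64, 256, 256])
```

In a torchvision ResNet you would assign such a layer to model.conv1 before the forward pass; note that the replaced layer starts from random weights, so the pretrained conv1 weights are lost.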