Missing batch and channel dimensions [Error: Expected 4-dimensional input for 4-dimensional weight 32 1 5 5, but got 2-dimensional input of size [2048, 2048]]

Hi, I’m trying to implement a CNN model as below:

However, I got the error [Expected 4-dimensional input for 4-dimensional weight 32 1 5 5, but got 2-dimensional input of size [2048, 2048]] and I don't know how to solve the problem. Can anybody help me? Thank you very much!!!

My CNN code:

MLP_modules = []
MLP_modules.append(nn.Conv2d(1, 32, kernel_size=5))
MLP_modules.append(nn.ReLU())
MLP_modules.append(nn.Conv2d(32, 1, kernel_size=3))
MLP_modules.append(nn.ReLU())
MLP_modules.append(nn.Conv2d(32, 1, kernel_size=2))
MLP_modules.append(nn.ReLU())
self.MLP_layers = nn.Sequential(*MLP_modules)

My forward() function:

embed_user_MLP = self.embed_user_GMF(user)
embed_item_MLP = self.embed_item_GMF(item)
interaction_matrix = torch.ger(embed_user_MLP.reshape(-1), embed_item_MLP.reshape(-1))
output_MCLP = self.MLP_layers(interaction_matrix)

[Drawing: outer product of the user embedding and the item embedding]

Based on the error message, it seems the input to the first conv layer is given as a 2-dimensional tensor.
This is probably your embedding.
An nn.Conv2d layer expects a 4-dimensional input with the shape [batch_size, channels, height, width].
Based on the setup of your first conv layer and your drawing, I assume you would like to use a single channel with a spatial size of [8, 8].

If that's the case, try to unsqueeze the batch and channel dimensions using:

# x.shape == [8, 8]
x = x.unsqueeze(0).unsqueeze(1)
# x.shape == [1, 1, 8, 8]

However, could you also post the first code snippet including the embedding? Apparently your batch dimension went missing, which could be another (hidden) error in your code.

I use PyTorch's embedding layer like this:

self.embed_user_GMF = nn.Embedding(user_num, factor_num)
self.embed_item_GMF = nn.Embedding(item_num, factor_num)

with factor_num = 8.

I tried your solution, but I got a new error: [Given groups=1, weight of size 1 32 2 2, expected input[1, 1, 2042, 2042] to have 32 channels, but got 1 channels instead]

My code:

embed_user_MLP = self.embed_user_GMF(user)
embed_item_MLP = self.embed_item_GMF(item)
interaction_matrix = torch.ger(embed_user_MLP.reshape(-1), embed_item_MLP.reshape(-1))
interaction_matrix = interaction_matrix.unsqueeze(0).unsqueeze(1)
output_MCLP = self.MLP_layers(interaction_matrix)

Thanks for the code.
I’m a bit confused about the torch.ger call.
You are currently reshaping the output of both embedding layers to a 1-dimensional tensor.
This would also mean that you are collapsing all dimensions, including the batch dimension.
Is this what you plan to do?

Usually, you would keep the batch dimension intact, so could you explain the use case a bit more so that we can debug further?
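
To make the collapse concrete, here is a minimal sketch. The batch size of 256 is an assumption inferred from the reported [2048, 2048] input (256 * 8 = 2048), and the random tensors just stand in for the embedding outputs:

import torch

factor_num = 8
batch_size = 256  # assumed: 256 * 8 = 2048 matches the reported input size

# stand-ins for the outputs of nn.Embedding for a batch of user/item ids
embed_user_MLP = torch.randn(batch_size, factor_num)
embed_item_MLP = torch.randn(batch_size, factor_num)

# reshape(-1) merges the batch and embedding dimensions into one long vector
flat_user = embed_user_MLP.reshape(-1)  # shape: [2048]
flat_item = embed_item_MLP.reshape(-1)  # shape: [2048]

interaction_matrix = torch.ger(flat_user, flat_item)
print(interaction_matrix.shape)
> torch.Size([2048, 2048])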

Thank you for your response.

In theory, I have a 1D vector (tensor) of the user embedding and a 1D vector of the item embedding. Then I want to take the outer product of these two 1D vectors to obtain a 2D matrix (treated as an image); this matrix will be the input to the CNN model.

I reshaped the user & item tensors because I saw that torch.ger requires 1D tensor inputs; I'm not sure whether that is the right approach for my case :frowning:

Your approach might be right for your use case, but you could also calculate the outer product of batched tensors by using:

batch_size = 10
m, n = 3, 4
a = torch.randn(batch_size, m)
b = torch.randn(batch_size, n)

result = a.unsqueeze(2) * b.unsqueeze(1)
print(result.shape)
> torch.Size([10, 3, 4])
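
If you go this route, the only remaining step before the conv stack would be to add the channel dimension; a short sketch continuing from the snippet above:

# result has shape [batch_size, m, n]; nn.Conv2d expects [batch_size, channels, height, width]
x = result.unsqueeze(1)
print(x.shape)
> torch.Size([10, 1, 3, 4])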

Thank you very much for your response, @ptrblck!