Size Mismatch for Functional Linear Layer


I apologize that this is probably a simple question that has been answered before, but I could not find the answer. I’m attempting to use a CNN to extract features and then feed them into an FC network that outputs 2 variables. I’m attempting to use the functional linear layer as a way to dynamically handle the flattened features. self.cnn is a Sequential container whose last layer is nn.Flatten(). When I print the size of x after the CNN I see it is 15x152064, so I’m unclear why the F.linear call is failing with the error below. Any help would be appreciated.

RuntimeError: size mismatch, get 15, 15x152064,2

x = self.cnn(x)
batch_size, channels = x.size()
x = F.linear(x, torch.Tensor([256,channels]))
y_hat = self.FC(x)


That’s because torch.Tensor([256, channels]) is just the vector [256, 152064], which is a tensor of shape (2,), not a weight matrix of shape (256, 152064). I think you wanted to do something like x = F.linear(x, torch.randn(256, channels)) or something equivalent. But I don’t really see what you are trying to do: the layer you just added has a weight that is not registered as a parameter of the module, so it won’t be trainable. What is the reason you are trying to add it?
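To make the difference concrete, here is a minimal sketch comparing the two constructors. The feature size of 1024 is an assumption chosen to keep memory use small; the question reports 152064. The variable names are illustrative only.

```python
import torch
import torch.nn.functional as F

batch_size, channels = 15, 1024  # assumed sizes; the question has channels = 152064

# torch.Tensor([...]) treats the list as DATA: a 1-D tensor with two values.
w_wrong = torch.Tensor([256, channels])
print(w_wrong.shape)  # torch.Size([2])

# torch.randn(...) treats its arguments as a SHAPE: a (256, channels) matrix.
w = torch.randn(256, channels)

# F.linear computes x @ w.T, so w must have shape (out_features, in_features).
x = torch.randn(batch_size, channels)
out = F.linear(x, w)
print(out.shape)  # torch.Size([15, 256])
```

Note that w here is still a plain tensor created on the fly, so it is not trained; this only shows why the original call raised a size mismatch.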

Thanks for that, it does help. What I am trying to do is dynamically calculate the output size of the CNN and feed that into the FC network. I know I can print the output size of the CNN and set the input size of the first linear layer manually, but I wasn’t sure if there is a more elegant way of dealing with possible changes in the architecture of the CNN that would cause the output to vary in size.

Hi @ddicostanzo,

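I think this is not possible the way you want to do it in torch. However, the way you are solving the problem, i.e., using a CNN feature extractor followed by a Flatten layer and a fully connected head, is not the best choice. Modern neural networks (in computer vision) rely on a global pooling layer instead: it collapses each feature map to a single value, so the input size of the head depends only on the number of output channels of the CNN and not on the spatial size of its output. I think you should use a similar approach.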
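A rough sketch of the global pooling idea, under assumptions: the convolutional layers, channel counts, and the class name PooledNet are made up for illustration and are not the original poster's architecture. nn.AdaptiveAvgPool2d(1) plays the role of the global pooling layer.

```python
import torch
import torch.nn as nn

class PooledNet(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, out_features=2):
        super().__init__()
        # Assumed feature extractor; any CNN ending in 32 channels would do.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        # Global average pooling: (N, 32, H, W) -> (N, 32, 1, 1) for any H, W.
        self.pool = nn.AdaptiveAvgPool2d(1)
        # The head only needs the channel count, not the spatial size.
        self.fc = nn.Linear(32, out_features)

    def forward(self, x):
        x = self.cnn(x)
        x = self.pool(x)
        x = torch.flatten(x, 1)  # (N, 32)
        return self.fc(x)

model = PooledNet()
for size in (64, 97):  # different input resolutions, same head
    y_hat = model(torch.randn(15, 3, size, size))
    print(y_hat.shape)  # torch.Size([15, 2]) both times
```

With this layout, changing the depth or strides of the CNN only affects the head if the number of output channels changes.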

Thanks @omarfoq. I will investigate using the global pooling. Appreciate the help.