Linear layer input neurons number calculation after conv2d

First of all, there is a problem with your input shape. The shape should be BATCH_SIZE * CHANNEL * HEIGHT * WIDTH. So let's correct your shape; I assume your BATCH_SIZE = 36, CHANNEL = 3, HEIGHT = 200, WIDTH = 150.

images = image.permute(0,3,1,2)

Next, let's change your first Conv2d code. It should be

torch.nn.Conv2d(3, 64, kernel_size=(3, 3))

So after the first convolution, using your formula, we will have (the first dimension is the batch size, 36)

[36, 64, 198, 148]

After the second Conv2d operation, we will have

[36, 128, 196, 146]

After the max pooling, which halves the height and width, we will have

[36, 128, 98, 73]

Finally, the number of input features to the fully connected layer (per sample, after flattening) will be 128 × 98 × 73 = 915712.
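If you want to double-check the arithmetic, here is a minimal sketch in plain Python. It assumes things I am inferring rather than seeing in your code: both convolutions use a 3x3 kernel with stride 1 and no padding, and the pooling is a 2x2 max pool with stride 2. The helper name `conv2d_out` is made up for this example; it just applies the output-size formula from the Conv2d documentation to one spatial dimension.

```python
def conv2d_out(size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension (hypothetical helper).

    Implements floor((size + 2*padding - dilation*(kernel-1) - 1) / stride + 1),
    which applies to both convolution and max-pooling layers.
    """
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


# Assumed input: HEIGHT = 200, WIDTH = 150 (the batch size does not affect this).
h, w = 200, 150

# First conv: assumed 3x3 kernel, stride 1, no padding.
h, w = conv2d_out(h, 3), conv2d_out(w, 3)

# Second conv: assumed to have the same kernel settings, 128 output channels.
h, w = conv2d_out(h, 3), conv2d_out(w, 3)

# Max pooling: assumed 2x2 window with stride 2.
h, w = conv2d_out(h, 2, stride=2), conv2d_out(w, 2, stride=2)

out_channels = 128
in_features = out_channels * h * w
print(h, w, in_features)  # prints "98 73 915712"
```

Under those assumptions, `in_features` is the value you would pass as the first argument of `torch.nn.Linear` once the activations are flattened. If your layers use different padding or stride, change the arguments accordingly and the number will differ.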
