PyTorch nn.Conv2d output computation

I am using Python 3.8 and PyTorch 1.7.1. I saw some code that defines a Conv2d layer as follows:

Conv2d(3, 6, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)

The input ‘X’ passed to it is a 4D tensor of shape:

# torch.Size([4, 3, 6, 6])

The output volume for this conv layer is:

# torch.Size([4, 6, 3, 3])

I am trying to use the formula to compute output spatial dimensions for any conv layer: O = ((W - K + 2P)/S) + 1, where W = spatial dimension of image, K = filter/kernel size, P = zero padding & S = stride.

For this conv layer (call it ‘c1’), we get W = 6, K = 3, S = 2 & P = 1. Using the formula, O = ((6 - 3 + (2 x 1)) / 2) + 1 = 5/2 + 1 = 3.5.

The output volume: (4, 6, 3, 3) since number of filters used = 6. How is the spatial output from ‘c1’ then (3, 3)? What am I not getting?


The formula to calculate the output shape is given in the docs.
Your posted formula is missing the dilation and the subtraction of the constant 1s.
Also, note that the floor operation is used as indicated by the brackets.
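
As a rough sketch of the formula from the nn.Conv2d docs, here is a small pure-Python helper. The function name conv2d_out_size is made up for illustration and is not part of PyTorch; it handles one spatial dimension and assumes integer arguments, with dilation defaulting to 1 as in nn.Conv2d.

```python
def conv2d_out_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension, per the nn.Conv2d docs:

    floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)

    Hypothetical helper for illustration; not a PyTorch function.
    """
    numerator = size + 2 * padding - dilation * (kernel_size - 1) - 1
    # Integer floor division implements the floor in the formula.
    return numerator // stride + 1


# Values from the question: W = 6, K = 3, S = 2, P = 1
print(conv2d_out_size(6, kernel_size=3, stride=2, padding=1))  # 3
```

With dilation = 1 the numerator reduces to W - K + 2P, so the only difference from the formula in the question is the floor: 5 // 2 + 1 = 3 instead of 5/2 + 1 = 3.5.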


To spell it out: the division is followed by a floor so that each output dimension is an integer. So it’s not 5/2 + 1 = 3.5, but floor(5/2) + 1 = 2 + 1 = 3, the same as the output.
