PyTorch nn.Conv2d output computation

grid_world · February 9, 2021, 4:35pm

I am using Python 3.8 and PyTorch 1.7.1. I came across some code that defines a Conv2d layer as follows:

Conv2d(3, 6, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)

The input ‘X’ being passed to it is a 4D tensor:

X.shape
# torch.Size([4, 3, 6, 6])

The output volume for this conv layer is:

c1(X).shape
# torch.Size([4, 6, 3, 3])

I am trying to use the formula to compute output spatial dimensions for any conv layer: O = ((W - K + 2P)/S) + 1, where W = spatial dimension of image, K = filter/kernel size, P = zero padding & S = stride.

For ‘c1’ conv layer, we get, W = 6, K = 3, S = 2 & P = 1. Using the formula, you get O = ((6 - 3 + (2 x 1)) / 2) + 1 = 5/2 + 1 = 3.5.

The output volume is (4, 6, 3, 3), since the number of filters used is 6. But how is the spatial output from ‘c1’ then (3, 3)? What am I not getting?

Thanks!

ptrblck · February 10, 2021, 8:16am

The formula to calculate the output shape is given in the docs.
Your posted formula is missing the dilation and the subtraction of the constant 1s.
Also, note that the floor operation is used as indicated by the brackets.
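
For reference, the nn.Conv2d docs give the output size along one spatial dimension as H_out = floor((H_in + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1). Below is a minimal sketch of that formula in plain Python. The helper name conv2d_out_size is made up for illustration and is not part of PyTorch; it assumes integer arguments and handles a single dimension at a time.

```python
def conv2d_out_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension (hypothetical helper, not a PyTorch API)."""
    # Integer floor division plays the role of the floor in the docs formula.
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Values from the question: W = 6, K = 3, S = 2, P = 1, default dilation of 1.
out = conv2d_out_size(6, kernel_size=3, stride=2, padding=1)
print(out)  # 3
```

With dilation = 1 the numerator reduces to W - K + 2P, so this is the formula from the question with a floor applied to the division.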

Codefresher · February 10, 2021, 2:03pm

Actually, a floor is applied after the division to make sure each output dimension is an integer, so it’s not 5/2 + 1 = 3.5 but floor(5/2) + 1 = 2 + 1 = 3, the same as the output.
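
A quick way to see which rounding mode matches the observed (3, 3) output is to try both on the numbers from the question. This is only a sanity check using the standard math module; the variable names are chosen here for illustration.

```python
import math

# Values from the question: W = 6, K = 3, P = 1, S = 2.
w, k, p, s = 6, 3, 1, 2
ratio = (w - k + 2 * p) / s  # 5 / 2 = 2.5

floor_result = math.floor(ratio) + 1  # 2 + 1 = 3, matches the (3, 3) output
ceil_result = math.ceil(ratio) + 1    # 3 + 1 = 4, does not match

print(floor_result, ceil_result)  # 3 4
```

Only the floor reproduces the spatial size reported by c1(X).shape.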