Adaptive_avg_pool2d vs avg_pool2d

What is the difference between adaptive_avg_pool2d and avg_pool2d under torch.nn.functional? What does adaptive mean?


In avg_pool2d, we define the kernel size, stride, and padding for the pooling operation, and the function simply applies that window at every valid input position. For example, avg_pool2d with kernel=3, stride=2 and padding=1 reduces a 5x5 tensor (HxW) to 3x3, and a 7x7 tensor to 4x4.
In adaptive_avg_pool2d, we define the output size we require at the end of the pooling operation, and PyTorch infers what pooling parameters to use to achieve it. For example, adaptive_avg_pool2d with output_size=(3,3) reduces both a 5x5 and a 7x7 tensor to 3x3.
This is especially useful when your input size varies and you have fully connected layers at the top of your CNN, since those layers require a fixed-size input.
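A quick way to see the difference (a minimal sketch, assuming PyTorch is installed; the kernel/stride/padding values are just illustrative):

```python
import torch
import torch.nn.functional as F

x5 = torch.randn(1, 1, 5, 5)
x7 = torch.randn(1, 1, 7, 7)

# avg_pool2d: the output size depends on the input size
print(F.avg_pool2d(x5, kernel_size=3, stride=2, padding=1).shape)  # torch.Size([1, 1, 3, 3])
print(F.avg_pool2d(x7, kernel_size=3, stride=2, padding=1).shape)  # torch.Size([1, 1, 4, 4])

# adaptive_avg_pool2d: the output size is fixed, regardless of the input size
print(F.adaptive_avg_pool2d(x5, output_size=(3, 3)).shape)  # torch.Size([1, 1, 3, 3])
print(F.adaptive_avg_pool2d(x7, output_size=(3, 3)).shape)  # torch.Size([1, 1, 3, 3])
```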


Thanks Mazhar! Based on what you said, it seems to me ‘adaptive’ is in the sense of adapting the kernel size and stride and maybe padding to the output size, not in the sense of varying the weights while taking the average. In other words, the average is the plain average (sum divided by the number of elements in the kernel), not a weighted average of elements that fall within the kernel. Is my understanding correct?


That’s correct, LMA, to the best of my knowledge.

I am not sure how nn.AdaptiveAvgPool2d works…
I defined a tensor of shape (1,1,3,3)

inp = torch.tensor([[[[1, 2., 3], [4, 5, 6], [7, 8, 9]]]], dtype=torch.float)

print(inp.shape)
torch.Size([1, 1, 3, 3])
print(inp)
tensor([[[[1., 2., 3.],
          [4., 5., 6.],
          [7., 8., 9.]]]])

Then I applied AdaptiveAvgPool2d to it, and the result was not what I had expected.

out = nn.AdaptiveAvgPool2d((2, 2))(inp)
print(out)
tensor([[[[3., 4.],
          [6., 7.]]]])

I thought the result would look like

tensor([[[[5., 6.],
          [8., 9.]]]])

Please correct my understanding of how adaptive pooling works. Thanks in advance! :slight_smile:


Hi n0obcoder,
For an output_size of (2,2) and an input size of (3,3), the inferred kernel size is (2,2) with stride (1,1). Accordingly,
the output will be

tensor([[[[(1+2+4+5)/4., (2+3+5+6)/4.],     = tensor([[[[3., 4.],
          [(4+5+7+8)/4., (5+6+8+9)/4.]]]])              [6., 7.]]]])

If we were using max pooling, the output would be

tensor([[[[max(1,2,4,5), max(2,3,5,6)],     = tensor([[[[5., 6.],
          [max(4,5,7,8), max(5,6,8,9)]]]])              [8., 9.]]]])

Hope this helps!
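The arithmetic above can be checked directly. For this input, adaptive average pooling gives the same result as a plain (unweighted) 2x2 average pool with stride 1 — which also confirms that no weighting is involved:

```python
import torch
import torch.nn.functional as F

inp = torch.tensor([[[[1., 2., 3.],
                      [4., 5., 6.],
                      [7., 8., 9.]]]])

out = F.adaptive_avg_pool2d(inp, (2, 2))
print(out)
# tensor([[[[3., 4.],
#           [6., 7.]]]])

# Same result as a plain 2x2 average pool with stride 1
same = F.avg_pool2d(inp, kernel_size=2, stride=1)
print(torch.equal(out, same))  # True
```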


My bad. I was using AdaptiveAvgPool2d and expecting it to work like AdaptiveMaxPool2d :stuck_out_tongue:
Thanks for the reply anyway :D


By the way, do you have any idea how to design an autoencoder architecture for varying-size input images?
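One common approach is to use AdaptiveAvgPool2d in the encoder to get a fixed-size bottleneck, and F.interpolate in the decoder to restore the original spatial size. Here is a rough sketch (the channel counts, bottleneck size, and class name are all illustrative assumptions, not a tested design):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VarSizeAutoencoder(nn.Module):
    # Hypothetical layer sizes; adjust for your data.
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),   # fixed-size bottleneck for any input
        )
        self.decoder = nn.Sequential(
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, x):
        h = self.encoder(x)                      # (N, 8, 4, 4) regardless of input size
        h = F.interpolate(h, size=x.shape[-2:])  # upsample back to the input size
        return self.decoder(h)

ae = VarSizeAutoencoder()
print(ae(torch.randn(1, 1, 28, 28)).shape)  # torch.Size([1, 1, 28, 28])
print(ae(torch.randn(1, 1, 37, 41)).shape)  # torch.Size([1, 1, 37, 41])
```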

Is there a way to replace AdaptiveAvgPool2d with AvgPool2d, i.e. how can one calculate the kernel size, stride, and padding?
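When the adaptive windows line up evenly (for instance when the input size is an integer multiple of the output size, or in the 3→2 case above), the equivalent parameters can be computed as stride = in // out, kernel = in - (out - 1) * stride, padding = 0. This is only a sketch of that special case: in general, adaptive pooling uses windows of varying size that no single AvgPool2d configuration can reproduce.

```python
import torch
import torch.nn.functional as F

def avg_pool_params(in_size, out_size):
    # Valid only when the adaptive windows are evenly spaced,
    # e.g. when in_size is a multiple of out_size; not every pair works.
    stride = in_size // out_size
    kernel = in_size - (out_size - 1) * stride
    return kernel, stride

x = torch.randn(1, 1, 8, 8)
k, s = avg_pool_params(8, 4)  # kernel=2, stride=2
adaptive = F.adaptive_avg_pool2d(x, (4, 4))
fixed = F.avg_pool2d(x, kernel_size=k, stride=s, padding=0)
print(torch.allclose(adaptive, fixed))  # True
```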

Have a look at this.