Questions about global average pooling

I have some questions regarding the use of adaptive average pooling instead of concatenation. The questions come from two threads on the forum.


Q1: What is the preferred approach to using global average pooling in current SOTA models: should there be a fully connected layer after it, or should the network be fully convolutional?

Q2: How do I change the output size to be size k? Do I need a conv2d layer before it? From the first forum thread it seems like I need a layer with k out_channels before the pooling:

self.conv2d_last = nn.Conv2d(in_channels, out_channels=k, kernel_size=1)

then in the forward pass I have

x = torch.cat([x1, x2], dim=1)  # if I want to concatenate outputs from different conv layers along the channel dim
x = self.conv2d_last(x)
x = F.adaptive_avg_pool2d(x, (1, 1))

Will that give me a vector of size k that I can use as the output (or will I need to flatten it first)? Is this right?
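Yes, with a flatten at the end. A minimal sketch of the three steps above (the batch size, input channel count, and spatial size are made up for illustration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

k = 10  # desired output size (illustrative)

# 1x1 conv maps the channel dimension down to k
conv2d_last = nn.Conv2d(in_channels=64, out_channels=k, kernel_size=1)

x = torch.randn(8, 64, 7, 7)          # batch x channel x H x W
x = conv2d_last(x)                    # -> (8, k, 7, 7)
x = F.adaptive_avg_pool2d(x, (1, 1))  # -> (8, k, 1, 1)
x = torch.flatten(x, 1)               # the flatten is needed: -> (8, k)
```

So the pooled tensor is (batch, k, 1, 1); the flatten (or a .view) is what turns it into a (batch, k) vector.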

I use a fully connected layer after pooling,

    self.out1 = nn.Sequential(
        nn.AvgPool2d(4)  # where 4 is the kernel size
    )

then in the forward pass:

    x = self.conv4(x)
    x = self.out1(x)
    x = x.view(-1, 1024*1*1)  # flatten to (batch, 1024)

Look at adaptive average pooling (nn.AdaptiveAvgPool2d) if you don't want to hard-code the kernel size.
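The advantage over the fixed AvgPool2d(4) above is that AdaptiveAvgPool2d fixes the *output* size, so the same module works for any input resolution. A quick sketch (channel count 1024 matches the snippet above; the spatial sizes are arbitrary):

```python
import torch
import torch.nn as nn

# output size is fixed at (1, 1) regardless of the input H x W
pool = nn.AdaptiveAvgPool2d((1, 1))

for hw in (4, 7, 13):               # different input spatial sizes
    x = torch.randn(2, 1024, hw, hw)
    y = pool(x).view(-1, 1024)      # same flatten, no hard-coded kernel size
    print(y.shape)                  # torch.Size([2, 1024]) every time
```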

Credits : https://github.com/pytorch/vision/issues/538

Hth!

So for the adaptive pooling with output size (1, 1), the input is batch x channel x H x W and the output is batch x channel x 1 x 1?


Yes, it will.
You can also use convolutions instead of max pooling; you'll then have to manually size the final Conv2d layer, and use a fully connected layer instead of average pooling. (As they say, the entire network is a hyperparameter.)
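To confirm the shape question above: with output size (1, 1), only the spatial dimensions are collapsed, and the batch and channel dimensions pass through unchanged. A tiny check (sizes are arbitrary):

```python
import torch
import torch.nn.functional as F

x = torch.randn(4, 32, 9, 15)           # batch x channel x H x W (H and W need not match)
y = F.adaptive_avg_pool2d(x, (1, 1))
print(y.shape)                          # torch.Size([4, 32, 1, 1])
```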