Does anybody know how to use an average pooling function to reduce a [14, 14, 2048] tensor to a 2048-dimensional vector, where each value in the vector is the mean over a 14x14 feature map?
I am a newbie with PyTorch and I have written my own function in Python, but it is inefficient.
So if your input is x, a 4-dimensional tensor of size [batch_size, 2048, 14, 14], where 2048 is the number of channels (or feature maps), then you can apply average pooling as follows:
out = torch.nn.functional.avg_pool2d(x, kernel_size=14)
This will result in an output of size [batch_size, 2048, 1, 1], but you need to reshape it to get [batch_size, 2048]:
out = out.reshape(-1, 2048)
This is called Global Average Pooling.
A full example is given below:
>>> import torch
>>> import torch.nn.functional as F
>>> x = torch.randn(32, 2048, 14, 14)
>>> x.shape
torch.Size([32, 2048, 14, 14])
>>> out = F.avg_pool2d(x, kernel_size=14)
>>> out.shape
torch.Size([32, 2048, 1, 1])
Thanks for your answer!
But is there any other way to do this?
Because the tensor size is torch.Size([14, 14, 2048]) in my dataset. That is my main problem.
Sure, you can swap the axes to get the desired shape using .permute():
>>> a = torch.randn(32, 14, 14, 2048)
>>> a.shape
torch.Size([32, 14, 14, 2048])
>>> a = a.permute(0, 3, 1, 2)
>>> a.shape
torch.Size([32, 2048, 14, 14])
Once the tensor is in this shape, you can apply avg_pool2d as described above.
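Putting the two steps together for a batched channels-last tensor, a minimal sketch looks like this:

```python
import torch
import torch.nn.functional as F

a = torch.randn(32, 14, 14, 2048)  # [batch, height, width, channels]

# avg_pool2d expects channels-first input, so move channels to dim 1
a = a.permute(0, 3, 1, 2)          # -> [32, 2048, 14, 14]

# pool over the full 14x14 spatial extent, then drop the 1x1 dims
out = F.avg_pool2d(a, kernel_size=14).reshape(-1, 2048)  # -> [32, 2048]
```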
Furthermore, if you start from a tensor of size [14, 14, 2048]
, you need to add an extra dimension by calling .unsqueeze()
as follows:
>>> a = torch.randn(14, 14, 2048)
>>> a.shape
torch.Size([14, 14, 2048])
>>> a = a.unsqueeze(dim=0)
>>> a.shape
torch.Size([1, 14, 14, 2048])
This assumes a batch of size 1 (the first dimension). Now you can permute the axes of this tensor and pass the result through avg_pool2d
:
>>> a = a.permute(0, 3, 1, 2)
>>> a.shape
torch.Size([1, 2048, 14, 14])
>>> b = F.avg_pool2d(a, kernel_size=14)
>>> b.shape
torch.Size([1, 2048, 1, 1])
>>> b = b.squeeze()
>>> b.shape
torch.Size([2048])
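Alternatively, since global average pooling is just a mean over the spatial dimensions, you can skip the unsqueeze/permute steps entirely and call .mean() directly on your [14, 14, 2048] tensor. A minimal sketch:

```python
import torch

a = torch.randn(14, 14, 2048)

# Average over the two spatial dimensions (dims 0 and 1),
# leaving only the 2048 channel dimension.
b = a.mean(dim=(0, 1))  # -> torch.Size([2048])
```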