Does anybody know how to use an average pooling function to reduce a [14, 14, 2048] tensor to a 2048-dimensional vector, where each value in the vector is the mean over a 14x14 feature map?
I am a newbie with PyTorch and I have written my own function in Python, but it is inefficient.
So if your input is x, a 4-dimensional tensor of size [batch_size, 2048, 14, 14], where 2048 is the number of channels (or feature maps), then you can apply average pooling as follows:
out = torch.nn.functional.avg_pool2d(x, kernel_size=14)
This will result in an output of size [batch_size, 2048, 1, 1], but you need to reshape it to get [batch_size, 2048]:
out = out.reshape(-1, 2048)
This is called Global Average Pooling.
A full example is given below:
>>> import torch
>>> import torch.nn.functional as F
>>> x = torch.randn(32, 2048, 14, 14)
>>> x.shape
torch.Size([32, 2048, 14, 14])
>>> out = F.avg_pool2d(x, kernel_size=14)
>>> out.shape
torch.Size([32, 2048, 1, 1])
Thanks for your answer!
But is there any other way to do this?
Because the tensor size is torch.Size([14, 14, 2048]) in my dataset. That is my main problem.
Sure, you can swap the axes to get the desired shape using .permute():
>>> a = torch.randn(32, 14, 14, 2048)
>>> a.shape
torch.Size([32, 14, 14, 2048])
>>> a = a.permute(0, 3, 1, 2)
>>> a.shape
torch.Size([32, 2048, 14, 14])
Once the tensor is in this shape, you can apply avg_pool2d as described above.
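Putting the two steps together for a batched channels-last tensor, a minimal sketch looks like this:

```python
import torch
import torch.nn.functional as F

a = torch.randn(32, 14, 14, 2048)  # [batch, height, width, channels]

# avg_pool2d expects channels-first input, so move channels to dim 1
a = a.permute(0, 3, 1, 2)          # -> [32, 2048, 14, 14]

# pool over the full 14x14 spatial extent, then drop the 1x1 dims
out = F.avg_pool2d(a, kernel_size=14).reshape(-1, 2048)  # -> [32, 2048]
```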
Furthermore, if you start from a tensor of size [14, 14, 2048]
, you need to add an extra dimension by calling .unsqueeze()
as follows:
>>> a = torch.randn(14, 14, 2048)
>>> a.shape
torch.Size([14, 14, 2048])
>>> a = a.unsqueeze(dim=0)
>>> a.shape
torch.Size([1, 14, 14, 2048])
This assumes a batch of size 1 (the first dimension). Now you can permute the axes of this tensor and pass the result through avg_pool2d
:
>>> a = a.permute(0, 3, 1, 2)
>>> a.shape
torch.Size([1, 2048, 14, 14])
>>> b = F.avg_pool2d(a, kernel_size=14)
>>> b.shape
torch.Size([1, 2048, 1, 1])
>>> b = b.squeeze()
>>> b.shape
torch.Size([2048])
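Alternatively, since global average pooling is just a mean over the spatial dimensions, you can skip the unsqueeze/permute steps entirely and call .mean() directly on your [14, 14, 2048] tensor. A minimal sketch:

```python
import torch

a = torch.randn(14, 14, 2048)

# Average over the two spatial dimensions (dims 0 and 1),
# leaving only the 2048 channel dimension.
b = a.mean(dim=(0, 1))  # -> torch.Size([2048])
```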