Another way to do global average pooling over each feature map is to use torch.mean,
as suggested by @Soumith_Chintala, but we first need to flatten each feature map into a vector. The following snippet illustrates the idea:
import torch
# suppose x is your feature map with size N*C*H*W
# flatten each H*W map into a vector, then average over that last dimension
x = torch.mean(x.view(x.size(0), x.size(1), -1), dim=2)
# now x is of size N*C
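As a quick sanity check of the shapes, here is a minimal sketch on a random tensor. The sizes (2, 3, 4, 5) and the names pooled and pooled_direct are assumptions made up for illustration. It also compares against passing a tuple of dims to mean, which should give the same result without the explicit view.

```python
import torch

# hypothetical sizes chosen only for illustration: N=2, C=3, H=4, W=5
x = torch.randn(2, 3, 4, 5)

# flatten each H*W feature map into a vector, then average over it
pooled = torch.mean(x.view(x.size(0), x.size(1), -1), dim=2)

# same reduction without the view: average over dims 2 and 3 together
pooled_direct = x.mean(dim=(2, 3))

print(pooled.shape)         # one averaged value per (sample, channel) pair
print(pooled_direct.shape)
```

The tuple-dim form also works on non-contiguous tensors, where view may raise an error.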
You can also use adaptive_avg_pool2d
to achieve global average pooling: just set the output size to (1, 1).
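import torch.nn.functional as F
# x is again your feature map with size N*C*H*W
x = F.adaptive_avg_pool2d(x, (1, 1))
# now x is of size N*C*1*1 (the spatial dimensions are kept with size 1)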
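To compare this with the torch.mean version, a rough sketch follows. The input is random, and the sizes and variable names are assumptions for illustration only. It flattens the pooled output down to N*C and checks that both approaches agree up to floating-point tolerance.

```python
import torch
import torch.nn.functional as F

# hypothetical sizes chosen only for illustration: N=2, C=3, H=4, W=5
x = torch.randn(2, 3, 4, 5)

# adaptive pooling to a 1x1 output keeps the two spatial dims
pooled_4d = F.adaptive_avg_pool2d(x, (1, 1))

# drop the trailing 1x1 dims by flattening everything after the batch dim
pooled_2d = torch.flatten(pooled_4d, 1)

# reference result from the torch.mean approach
reference = torch.mean(x.view(x.size(0), x.size(1), -1), dim=2)

print(pooled_4d.shape, pooled_2d.shape)
print(torch.allclose(pooled_2d, reference, atol=1e-6))
```

Keeping the N*C*1*1 shape is convenient if a convolution follows. Flatten it when the next layer is a linear classifier.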