For anyone who is also interested in exactly how to do it: suppose the feature map has size
N*C*H*W. After sum-pooling, it becomes a tensor of size
N*C (N images, each with a feature vector of dimension C). Here is a code snippet that does it:
# suppose x is the feature map after some layer
x = torch.sum(x.view(x.size(0), x.size(1), -1), dim=2)
The above code flattens the two spatial dimensions of the original tensor (H and W, i.e. dimensions 2 and 3 with zero-based indexing) into a single vector, then sums over the newly created dimension 2, which is exactly what sum-pooling does.
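
To check the shapes, here is a small self-contained sketch. The tensor sizes (N=2, C=3, H=4, W=5) and the names pooled_view and pooled_direct are made up for illustration. It also compares the snippet with summing over both spatial dimensions directly, which should give the same result:

```python
import torch

# Hypothetical feature map: N=2 images, C=3 channels, H=4, W=5
x = torch.arange(2 * 3 * 4 * 5, dtype=torch.float32).view(2, 3, 4, 5)

# Approach from the snippet: flatten H and W, then sum over the flattened axis
pooled_view = torch.sum(x.view(x.size(0), x.size(1), -1), dim=2)

# Alternative: sum over both spatial dimensions in one call
pooled_direct = x.sum(dim=(2, 3))

print(pooled_view.shape)                          # torch.Size([2, 3])
print(torch.allclose(pooled_view, pooled_direct)) # True
```

One caveat: view only works when the tensor's memory layout allows it (for example, it raises an error after a permute). In that case, x.reshape(...) or the direct x.sum(dim=(2, 3)) form avoids the problem.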