Hello. I’m trying to develop a “weighted average pooling” operation. Regular average pooling takes a patch and returns its mean, but I want that mean to be weighted. This can easily be achieved with a convolution, by convolving the weights (say, a 3x3 kernel) with the feature maps. However, there is a fundamental difference between convolutions and pooling operations: the latter are applied batch- and channel-wise, whereas convolutions mix the channels.
There are two ways to solve this:

Conv2d(num_fi, num_fi, 3, groups=num_fi)
so that the convolutions are performed independently per channel. One problem with this approach is that you have to artificially replicate the kernel num_fi times, once per channel. Consequently, the module cannot receive a different number of channels at forward time; in practice, you would need one such convolution for every channel count you want to support.
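As a sketch of this first approach (the uniform kernel here is a placeholder; it makes the result equal to plain average pooling, and any other 3x3 weight matrix could be substituted):

```python
import torch
import torch.nn.functional as F

B, C, H, W = 2, 4, 8, 8
x = torch.randn(B, C, H, W)

# Placeholder weights: uniform 1/9, i.e. plain average pooling
kernel = torch.ones(3, 3) / 9.0
# Replicate the kernel once per channel: shape (C, 1, 3, 3)
weight = kernel.view(1, 1, 3, 3).repeat(C, 1, 1, 1)

# groups=C applies each kernel to its own channel, with no channel mixing
out = F.conv2d(x, weight, groups=C, padding=1)
```

Note that `weight` is tied to `C`, which is exactly the limitation described above: a different channel count at forward time would require rebuilding the weight tensor.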
To accept any number of channels at forward time, you could instead reshape the tensors. The convolution would then be defined as

Conv2d(1, 1, 3)

and you reshape the input to (B*C, 1, H, W) and later reshape the output back to its original size.
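A sketch of this second approach (again with a placeholder uniform kernel standing in for the learned/constant weights):

```python
import torch
import torch.nn.functional as F

B, C, H, W = 2, 4, 8, 8
x = torch.randn(B, C, H, W)

# A single shared 3x3 kernel: shape (1, 1, 3, 3)
weight = (torch.ones(3, 3) / 9.0).view(1, 1, 3, 3)

# Fold channels into the batch dimension so a 1-in/1-out conv
# treats every channel independently
out = F.conv2d(x.reshape(B * C, 1, H, W), weight, padding=1)
out = out.reshape(B, C, H, W)
```

This version works for any C at forward time, since the weight tensor no longer depends on the number of channels.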
The problem with these approaches is that they are not as efficient as pooling, taking roughly twice the time. For this reason, I would like to know whether there is a way to tweak AvgPool to multiply each averaged patch by a constant matrix of weights. Any other suggestions on how to tackle this are also appreciated.
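For reference, the operation I am after can be spelled out explicitly with unfold (the random weights here are just an example; this is only meant to pin down the intended semantics, not to be a fast implementation):

```python
import torch
import torch.nn.functional as F

B, C, H, W = 2, 4, 8, 8
x = torch.randn(B, C, H, W)

w = torch.rand(3, 3)   # example patch weights
w = w / w.sum()        # normalize so the result is a weighted average

# Extract all 3x3 patches: (B, C*9, H*W), then separate channels
patches = F.unfold(x, kernel_size=3, padding=1).view(B, C, 9, H * W)

# Weighted sum over each patch, identically for every batch and channel
out = (patches * w.view(1, 1, 9, 1)).sum(dim=2).view(B, C, H, W)
```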
Thanks!