About maxpooling or other pooling

I got confused while trying to use MaxPool2d. The input should be (batch_size, channels, height, width), and I thought the pooling kernel slides over (channels, height, width), so that it finds one max value to represent all the entries in a (channels, height, width) window. To be more specific, if I use a (20, 1, 1) kernel (say my input has 20 channels and the batch size is 1), the output should be (1, 1, height, width). But when I look at the PyTorch docs, the kernel doesn’t slide over the channel dimension. So I am really confused. I would appreciate it if anyone could answer my question.

MaxPool2d never includes the channel dimension in the max operation, as the docs say. Instead, it pools each channel of each sample in the mini-batch independently. Thus, for an input of (N, C, H, W), MaxPool2d gives you (N, C, Hout, Wout), where Hout and Wout are computed from the kernel size, stride, padding, and dilation passed to MaxPool2d. Well, it did say 2D in the name, didn’t it?
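To make the shape rule concrete, here is a minimal sketch. The sizes (batch of 1, 20 channels, 8x8 feature map) and the 2x2 window are made-up values for illustration, not taken from the question.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration: batch of 1, 20 channels, 8x8 feature map.
x = torch.randn(1, 20, 8, 8)

# A 2x2 window with stride 2 slides over H and W only,
# and is applied to every channel separately.
pool = nn.MaxPool2d(kernel_size=2, stride=2)
y = pool(x)

print(y.shape)  # torch.Size([1, 20, 4, 4]): C is unchanged, H and W are halved
```

Pooling a single channel on its own gives the same result as the matching slice of `y`, which is another way to see that channels never interact.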

If you can be a bit more specific about what you want to do, someone might be able to help you out.

If you want to pool over channels as well, maybe you can try MaxPool3d with an added dummy dimension.
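A rough sketch of the dummy-dimension idea, assuming the input is (N, C, H, W) and you want (N, 1, H, W). The variable names and sizes are placeholders chosen for the example.

```python
import torch
import torch.nn as nn

# Placeholder sizes for illustration.
N, C, H, W = 1, 20, 8, 8
x = torch.randn(N, C, H, W)

# MaxPool3d expects (N, C, D, H, W). Insert a dummy channel axis so that
# the original channels land in the depth (D) position.
x5 = x.unsqueeze(1)                          # (N, 1, C, H, W)

# Kernel spans all C entries along depth and a single pixel in H and W.
pool = nn.MaxPool3d(kernel_size=(C, 1, 1))   # stride defaults to kernel_size
y = pool(x5).squeeze(1)                      # (N, 1, 1, H, W) -> (N, 1, H, W)

print(y.shape)  # torch.Size([1, 1, 8, 8])
```

Each output pixel is the largest value across the 20 channels at that location, so H and W stay the same.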

Or, if it is only going to be a max over the channel dimension, you can also try using the max operation along that dimension. But you won’t be able to backpropagate, as max is non-differentiable.
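For reference, a small sketch of the reduction itself, with made-up sizes. It assumes channels sit in dimension 1, as in the (N, C, H, W) layout discussed above.

```python
import torch

# Placeholder input: batch of 1, 20 channels, 8x8 feature map.
x = torch.randn(1, 20, 8, 8)

# With a dim argument, torch.max returns both the max values and the
# index of the winning channel; keepdim=True keeps the channel axis as size 1.
values, indices = torch.max(x, dim=1, keepdim=True)

print(values.shape)   # torch.Size([1, 1, 8, 8])
print(indices.shape)  # torch.Size([1, 1, 8, 8])
```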


Thank you for your reply. I was trying to reduce the channel dimension of the tensor after a conv2d layer, that is, go from (N, C, H, W) to (N, 1, H, W), with H and W unchanged. Is that possible?

I used MaxPool3d and it worked. Thank you again.

@ImgPrcSng why do you say we can’t backprop through max? (Agreed, it’s non-differentiable, but so is ReLU, along with many other examples.) Can’t it backprop just like it does for max pooling?

I meant that the operation torch.max is non-differentiable. If you can rewrite it as an nn.MaxPool operation, then it will be.

Ummm, I tried with torch.max, and it works with that as well.
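A quick check along these lines, assuming a PyTorch version where tensors carry `requires_grad` directly (0.4 or later). The sizes are arbitrary.

```python
import torch

# Arbitrary small input that tracks gradients.
x = torch.randn(1, 20, 4, 4, requires_grad=True)

# Max over the channel dimension, keeping a size-1 channel axis.
values, indices = torch.max(x, dim=1, keepdim=True)

# Backprop through the max: the gradient is routed to the winning
# channel at each pixel, and every other entry gets zero.
values.sum().backward()

print(x.grad.shape)                  # torch.Size([1, 20, 4, 4])
print(int((x.grad != 0).sum()))      # 16, one nonzero entry per pixel
```

This is the same routing that max pooling uses in its backward pass: only the element that produced the max receives gradient.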

My apologies. I assumed that when an op returns a tensor, it won’t be differentiable. (We need a Variable, right?)

Maybe in the new update, where they merged Tensor and Variable, it behaves as you expect. I am not entirely sure.

No, no, I did pass a Variable to torch.max(var, dim). But yeah, I think in the new version they merged Variable and Tensor.