Custom pooling layer

Hello everyone.
I have read some similar topics about creating a custom pooling layer, but I could not implement what I want with torch.unfold.
I want to create a simple pooling layer that returns the sum of the values in a patch if that sum is greater than a given threshold (e.g. 5), and returns 0 otherwise.

Thanks a lot.

Hi Reza!

One approach would be to use something like AvgPool2d, rather than
write a full custom layer:

sum_pool = size_of_patch * torch.nn.AvgPool2d(kernel_size)(input)
threshold_sum_pool = (sum_pool > threshold) * sum_pool

(The purpose of the factor size_of_patch is to convert the average
over the patch to your desired sum.)
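
As a concrete sketch of this idea, the snippet below assumes non-overlapping 2x2 patches and a threshold of 5. Both values, the sample input, and the helper name `thresholded_sum_pool` are illustrative choices, not something from the original post.

```python
import torch
import torch.nn.functional as F

def thresholded_sum_pool(x, kernel_size=2, threshold=5.0):
    """Sum each patch; keep the sum only where it exceeds `threshold`."""
    if isinstance(kernel_size, int):
        kh, kw = kernel_size, kernel_size
    else:
        kh, kw = kernel_size
    # avg_pool2d returns the patch mean; multiplying by the patch size gives the sum.
    sum_pool = F.avg_pool2d(x, (kh, kw)) * (kh * kw)
    # The boolean mask is promoted to the dtype of sum_pool by the multiplication.
    return (sum_pool > threshold) * sum_pool

x = torch.tensor([[[[1., 2., 0., 0.],
                    [3., 4., 1., 1.],
                    [0., 0., 2., 2.],
                    [1., 0., 2., 2.]]]])  # shape (1, 1, 4, 4)
out = thresholded_sum_pool(x)
print(out)  # patch sums are 10, 2, 1, 8, so only 10 and 8 survive
```

Because everything here is built from standard differentiable operations, autograd should handle the backward pass without a custom layer.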

Best.

K. Frank

Thank you, K. Frank. Frankly, what I actually want to implement is something like the code below, which applies a Moore-neighborhood cellular automaton to every patch, but I can't get the output back into the correct shape.

import torch as th
import torch.nn as thNn
import torch.nn.functional as thFn

import math

def unpack_param_2d(param):
    # Accept either a single int or an (H, W) pair
    try:
        p_H, p_W = param[0], param[1]
    except (TypeError, IndexError):
        p_H, p_W = param, param

    return p_H, p_W

def mooreCA_pool_2d(inputs, kernel_size, stride, padding, dilation):

    # Input should be 4D (BCHW)
    assert inputs.dim() == 4

    # Get input dimensions
    b_size, c_size, h_size, w_size = inputs.size()

    # Get input parameters
    k_H, k_W = unpack_param_2d(kernel_size)
    s_H, s_W = unpack_param_2d(stride)
    p_H, p_W = unpack_param_2d(padding)
    d_H, d_W = unpack_param_2d(dilation)

    # First we unfold all the (kernel_size x kernel_size) patches
    unf_input = thFn.unfold(inputs, kernel_size, dilation, padding, stride)

    # Apply the cellular automaton rule to the center cell (index 4) of each patch
    for i in range(unf_input.shape[2]):
        sum_col = unf_input[c_size-1, :, i].sum()
        t = unf_input[c_size-1, 4, i]
        if t == 0:
            if sum_col == 5:
                unf_input[c_size-1, 4, i] = 1
            elif sum_col == 1 or sum_col == 2 or sum_col == 8:
                unf_input[c_size-1, 4, i] = 0
        else:
            if sum_col == 6:
                unf_input[c_size-1, 4, i] = 1
            elif sum_col == 2 or sum_col == 3 or sum_col == 9:
                unf_input[c_size-1, 4, i] = 0

    out_W = math.floor(((w_size + (2 * p_W) - (d_W * (k_W - 1)) - 1) / s_W) + 1)
    out_H = math.floor(((h_size + (2 * p_H) - (d_H * (k_H - 1)) - 1) / s_H) + 1)
    output = unf_input.view(b_size, c_size, out_H, out_W)   # this is where it goes wrong
    return output

For the input img = th.tensor([[0,1,1,1],[0,1,1,1],[0,1,1,1]]).float()
I got the error: RuntimeError: shape '[1, 1, 1, 2]' is invalid for input of size 18

Hi Reza!

I believe that your error message is saying that unf_input consists of
18 elements, but that you are trying to view it as a shape of [1, 1, 1, 2],
which would only contain 2 elements.
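
To see where the 18 comes from, it may help to look at the shape that `F.unfold` returns: `(B, C * k_H * k_W, L)`, where `L` is the number of patch positions. The sketch below assumes the 3x4 image was reshaped to `(1, 1, 3, 4)` and that a 3x3 kernel with stride 1 and no padding was used. That is consistent with the reported shape `[1, 1, 1, 2]`, but the post does not state it.

```python
import torch
import torch.nn.functional as F

img = torch.tensor([[0, 1, 1, 1],
                    [0, 1, 1, 1],
                    [0, 1, 1, 1]]).float()
x = img[None, None]            # add batch and channel dims -> (1, 1, 3, 4)

# Assumed settings: 3x3 kernel, stride 1, no padding, no dilation.
unf = F.unfold(x, kernel_size=3)
print(unf.shape)               # torch.Size([1, 9, 2])
print(unf.numel())             # 18
```

Each of the 2 patch positions carries all 9 values of its 3x3 patch, so the unfolded tensor is 9 times larger than the pooled output.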

This suggests that your calculation of out_W and / or out_H might be
incorrect, or that unf_input holds more values per output position
than you expect.

For debugging, I would suggest that you go through and print out the
values of the various intermediate expressions that go into computing
the dimensions passed to .view(). Do you really expect the result of
.view() to contain only 2 elements?
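
One possible fix, sketched under the assumption that the intended output is one value per patch (the updated center cell): reshape the unfolded tensor to `(B, C, k*k, L)`, apply the rules, and keep only the center row before reshaping to `(B, C, out_H, out_W)`. The name `moore_ca_pool_2d`, the int-only arguments, and the vectorized masks are a hypothetical rewrite, not the original poster's code. The rule values are copied from the loop in the question.

```python
import torch
import torch.nn.functional as F

def moore_ca_pool_2d(inputs, kernel_size=3, stride=1, padding=0, dilation=1):
    # Sketch only: assumes int arguments and an odd kernel_size,
    # so that every patch has a well-defined center cell.
    assert inputs.dim() == 4
    b, c, h, w = inputs.shape
    k, s, p, d = kernel_size, stride, padding, dilation

    unf = F.unfold(inputs, k, dilation=d, padding=p, stride=s)  # (B, C*k*k, L)
    patches = unf.reshape(b, c, k * k, -1)                      # (B, C, k*k, L)

    sums = patches.sum(dim=2)                        # patch sums, center included
    center = patches[:, :, (k * k) // 2, :].clone()  # center cell of each patch
    alive = center != 0

    # Rules from the question. Its "dead cell stays 0" branch is a no-op,
    # so it needs no mask here.
    set_one = (~alive & (sums == 5)) | (alive & (sums == 6))
    set_zero = alive & ((sums == 2) | (sums == 3) | (sums == 9))
    center[set_one] = 1.0
    center[set_zero] = 0.0

    out_h = (h + 2 * p - d * (k - 1) - 1) // s + 1
    out_w = (w + 2 * p - d * (k - 1) - 1) // s + 1
    return center.reshape(b, c, out_h, out_w)

img = torch.tensor([[0, 1, 1, 1],
                    [0, 1, 1, 1],
                    [0, 1, 1, 1]]).float()
out = moore_ca_pool_2d(img[None, None])
print(out.shape)  # torch.Size([1, 1, 1, 2])
print(out)        # tensor([[[[1., 0.]]]])
```

Here the first patch sums to 6 with a live center, so it stays 1. The second sums to 9 with a live center, so it becomes 0. Only 2 values remain, which matches the `[1, 1, 1, 2]` shape computed from out_H and out_W.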

Best.

K. Frank