Is it possible to implement a Spatial Pyramid Pooling (SPP) layer in PyTorch only, without using C/CUDA code?
An SPP layer essentially needs to pool a variably-sized feature map into a fixed-size feature map. For instance, an SPP layer with a single output size of 2×2 would pool over a 6×6 feature map with 3×3 windows, and over an 8×8 map with 4×4 windows.
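To make the window-size arithmetic concrete, here is a minimal sketch (the function name and shapes are only illustrative, not part of any existing API) of how one pooling level can be derived from the input size and the desired output size:

import math
import torch
import torch.nn as nn

def pool_to_fixed_size(x, out_size):
    # x: [B, C, H, W]; choose the window so the output is out_size x out_size
    h, w = x.shape[2], x.shape[3]
    kernel = (math.ceil(h / out_size), math.ceil(w / out_size))
    stride = kernel  # non-overlapping windows
    return nn.functional.max_pool2d(x, kernel_size=kernel, stride=stride, ceil_mode=True)

# 6x6 input -> 3x3 windows -> 2x2 output; 8x8 input -> 4x4 windows -> 2x2 output
print(pool_to_fixed_size(torch.randn(1, 1, 6, 6), 2).shape)  # torch.Size([1, 1, 2, 2])
print(pool_to_fixed_size(torch.randn(1, 1, 8, 8), 2).shape)  # torch.Size([1, 1, 2, 2])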
Hi, in (Lua) Torch I can use spatial pyramid pooling directly by calling inn.SpatialPyramidPooling({8,8},{4,4},{2,2},{1,1}). Can I do something similar in PyTorch?
This may not have been available at the time of the original discussion; however, PyTorch 1.4 has nn.AdaptiveMaxPool2d, which is designed to handle exactly this use case of converting a variable-size feature map to a fixed-size one. You can see my implementation of the entire SPP layer on GitHub.
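For reference, here is a minimal sketch of an SPP layer built on nn.AdaptiveMaxPool2d (the output sizes 4, 2, 1 and the class name SPPLayer are just illustrative, not taken from the linked repository):

import torch
import torch.nn as nn

class SPPLayer(nn.Module):
    """Pools a variable-sized feature map into fixed-size bins and concatenates them."""
    def __init__(self, out_pool_sizes=(4, 2, 1)):
        super().__init__()
        # one adaptive pooling level per pyramid size
        self.pools = nn.ModuleList(nn.AdaptiveMaxPool2d(s) for s in out_pool_sizes)

    def forward(self, x):
        # x: [B, C, H, W] with arbitrary H and W
        b = x.size(0)
        feats = [pool(x).view(b, -1) for pool in self.pools]
        # output length is fixed: C * sum(s * s for s in out_pool_sizes)
        return torch.cat(feats, dim=1)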
I am implementing spatial pyramid pooling in my network, but I am still confused about how to implement it. I think there is an implementation problem in my class, because my class only contains one Conv layer:
import math
import torch
import torch.nn as nn

class DWConv(nn.Module):
    def __init__(self, dim=768):
        super(DWConv, self).__init__()
        # depthwise 3x3 convolution (groups=dim keeps one filter per channel)
        self.dwconv = nn.Conv2d(dim, dim, 3, 1, 1, bias=True, groups=dim)

    def spatial_pyramid_pool(self, previous_conv, previous_conv_size, out_pool_size=[4, 2, 1]):
        '''
        previous_conv: a 4-D tensor [batch, channels, height, width] from the previous convolution layer
        previous_conv_size: an int vector [height, width] giving the spatial size of that feature map
        out_pool_size: an int vector of expected output sizes of the max pooling levels
        returns: a tensor of shape [batch x n], the concatenation of the multi-level pooling outputs
        '''
        num_sample = previous_conv.shape[0]
        for i in range(len(out_pool_size)):
            # choose the window so the pooled output has out_pool_size[i] bins per dimension
            h_wid = int(math.ceil(previous_conv_size[0] / out_pool_size[i]))
            w_wid = int(math.ceil(previous_conv_size[1] / out_pool_size[i]))
            h_pad = (h_wid * out_pool_size[i] - previous_conv_size[0] + 1) // 2
            w_pad = (w_wid * out_pool_size[i] - previous_conv_size[1] + 1) // 2
            maxpool = nn.MaxPool2d((h_wid, w_wid), stride=(h_wid, w_wid), padding=(h_pad, w_pad))
            x = maxpool(previous_conv)
            if i == 0:
                spp = x.view(num_sample, -1)
            else:
                spp = torch.cat((spp, x.view(num_sample, -1)), 1)
        return spp

    def forward(self, x, H, W):
        B, N, C = x.shape
        # reshape the token sequence [B, N, C] back to a feature map [B, C, H, W]
        x = x.transpose(1, 2).view(B, C, H, W)
        x = self.dwconv(x)
        # pool while x is still 4-D; pass both the tensor and its spatial size
        x = self.spatial_pyramid_pool(x, previous_conv_size=[H, W])
        return x
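For example, the module above could be called like this (the tensor shapes here are only an assumption to illustrate the call signature):

# tokens of a 14x14 feature map with 768 channels, batch size 2 (assumed shapes)
x = torch.randn(2, 14 * 14, 768)
model = DWConv(dim=768)
out = model(x, H=14, W=14)
# output length per sample: 768 * (4*4 + 2*2 + 1*1) = 16128
print(out.shape)  # torch.Size([2, 16128])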