# How do I create an L2 pooling 2d layer?

In this article: https://arxiv.org/abs/1511.06394, they describe using L2 pooling layers instead of max pooling or average pooling. The details of their implementation can be found under section 3.1.

I’m having trouble trying to figure out how to translate their equations to PyTorch, and I’m unsure as to how I would create a custom 2d pooling layer as well.

How do I implement this pooling layer in PyTorch?

I have the MaxPool2d class rewritten like this:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.modules.utils import _pair

class MaxPool2d(nn.Module):
    def __init__(self, kernel_size, stride):
        super(MaxPool2d, self).__init__()
        self.k = _pair(kernel_size)
        self.stride = _pair(stride)

    def forward(self, x):
        # extract sliding windows over the two spatial dims (H, W)
        x = x.unfold(2, self.k[0], self.stride[0]).unfold(3, self.k[1], self.stride[1])
        # flatten each window into the last dimension
        x = x.contiguous().view(x.size()[:4] + (-1,))
        pool, indices = torch.max(x, dim=-1)
        return pool
```
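As a quick sanity check (my own snippet; the 8×8 input and 2×2 kernel/stride are arbitrary choices), the unfold recipe can be compared against the built-in pooling:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)
k, s = 2, 2
# unfold-based max pooling, same recipe as the class above
windows = x.unfold(2, k, s).unfold(3, k, s)
windows = windows.contiguous().view(windows.size()[:4] + (-1,))
pool, _ = torch.max(windows, dim=-1)
# should match the built-in functional pooling
ref = F.max_pool2d(x, kernel_size=k, stride=s)
print(torch.allclose(pool, ref))  # → True
```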

And based on that I create the L2 layer like this:

```python
class L2Pool2d(nn.Module):
    def __init__(self, kernel_size, stride):
        super(L2Pool2d, self).__init__()
        self.k = _pair(kernel_size)
        self.stride = _pair(stride)

    def l2(self, x, constant=0, epsilon=1e-6):
        # square root of the sum of squared (shifted) values in each window;
        # epsilon keeps the gradient finite when a window is all zeros
        return torch.sqrt(torch.sum((x - constant) ** 2, dim=-1) + epsilon)

    def forward(self, x):
        x = x.unfold(2, self.k[0], self.stride[0]).unfold(3, self.k[1], self.stride[1])
        x = x.contiguous().view(x.size()[:4] + (-1,))
        return self.l2(x)
```
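If L2 pooling here just means the square root of the sum of squares over each window (my reading; I haven't checked it against the paper's exact equation), here is a tiny numeric check of the unfold recipe:

```python
import torch

# a single 2x2 window containing (3, 4, 0, 0)
x = torch.tensor([[[[3.0, 4.0],
                    [0.0, 0.0]]]])
win = x.unfold(2, 2, 2).unfold(3, 2, 2)
win = win.contiguous().view(win.size()[:4] + (-1,))
out = torch.sqrt(torch.sum(win ** 2, dim=-1))
print(out.item())  # sqrt(3^2 + 4^2) = 5.0
```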

I’m also looking for an implementation of this layer in PyTorch. Have you managed to successfully implement it?

Since the following answer implements L2 pooling in TF, maybe it would be:

```python
class L2Pool(nn.Module):
    def __init__(self, *args, **kwargs):
        super().__init__()
        self.pool = nn.AvgPool2d(*args, **kwargs)

    def forward(self, x):
        # average of the squared values, then square root
        return torch.sqrt(self.pool(x ** 2))
```

I am not sure it is right…


Looking at different implementations I found for TensorFlow, like this one: tensorflow - How to implement a L2 pooling layer in Keras? - Stack Overflow, I think you are right; I see most people implementing it like this. However, I’m not quite sure it’s the same as the L2 norm, which is the square root of the sum of squares. The only difference in your implementation (and the others I found) is the reduction by dividing by the number of elements (so taking the mean instead of the sum). Perhaps this reduction does not matter too much, though…

Edit:
I figured out that you can pass divisor_override=1 to nn.AvgPool2d so that you won’t have this issue.
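For example (my own toy input): divisor_override=1 turns nn.AvgPool2d into a sum pool, so the square root of the pooled squares is the exact L2 norm of each window:

```python
import torch
import torch.nn as nn

x = torch.tensor([[[[3.0, 4.0],
                    [0.0, 0.0]]]])
# divisor_override=1: sum instead of mean over each 2x2 window
sum_pool = nn.AvgPool2d(kernel_size=2, divisor_override=1)
out = torch.sqrt(sum_pool(x ** 2))
print(out.item())  # sqrt(9 + 16) = 5.0
```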


If reduction is not needed…
Option 1:

```python
class L2Pool(nn.Module):
    def __init__(self, kernel_h, kernel_w, stride=1):
        super().__init__()
        ## DO NOT USE nn.Parameter HERE -- the kernel is fixed, not learned
        self.weight = torch.ones(kernel_h, kernel_w).float()
        self.stride = stride

    def forward(self, x):
        # sum of squares per window via a depthwise conv with a ones kernel
        c = x.size(1)
        w = self.weight.expand(c, 1, -1, -1)
        return torch.sqrt(F.conv2d(x ** 2, w, stride=self.stride, groups=c))
```

Option 2:

```python
class L2Pool(nn.Module):
    def __init__(self, kernel_h, kernel_w, **kwargs):
        super().__init__()
        self.pool = nn.AvgPool2d((kernel_h, kernel_w), **kwargs)
        self.n = kernel_h * kernel_w

    def forward(self, x):
        # multiply the mean of squares back up to the sum of squares
        return torch.sqrt(self.pool(x ** 2) * self.n)
```
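Option 2's multiply-back trick should agree numerically with the divisor_override=1 trick from the edit above (the random input and 2×2 kernel are my assumptions):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 8, 8)
k = 2
# mean of squares, multiplied back up to the sum of squares
a = torch.sqrt(nn.AvgPool2d(k)(x ** 2) * (k * k))
# sum of squares directly via divisor_override=1
b = torch.sqrt(nn.AvgPool2d(k, divisor_override=1)(x ** 2))
print(torch.allclose(a, b, atol=1e-6))  # → True
```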

Option 3:

```python
class L2Pool(nn.Module):
    def __init__(self, *args, **kwargs):
        super().__init__()
        kwargs["divisor_override"] = 1  # divide by 1 instead of the kernel size
        self.pool = nn.AvgPool2d(*args, **kwargs)

    def forward(self, x):
        return torch.sqrt(self.pool(x ** 2))
```

Thanks for the quick replies! Looks good to me now