Using mathematical expressions over convolution kernels

Hi,

I want to apply some mathematical expressions to the kernels in a convolution. I wrote some code as a dummy example.

1- Is this kind of definition valid for gradient flow?
2- Should we use a super().__init__() definition here?
3- Does autograd allow logical constraints like x = x * (x < 5) (MATLAB-style x = x.*(x<5)) in forward?

class new_conv(nn.Conv2d):
    def __init__(self, dim):
        self.W   = Variable(torch.randn(dim, 1).type(dtype), requires_grad=True)
        self.act = F.relu

    def forward(self, x):
        xMean = x.mean(dim=1, keepdim=True).mean(dim=2, keepdim=True).mean(dim=3, keepdim=True)
        x = x - xMean 
        
        w = self.W**2
        wMean = w.mean(dim=1, keepdim=True).mean(dim=2, keepdim=True).mean(dim=3, keepdim=True)
        w = F.relu(w-wMean)
        x = torch.cat((x, x**2), 1)            # Channelwise concatenating (dim=1 is the channel dimension)
        
        return self.act(F.conv2d(x, w))

4- Last question: can we express this module as a function, like the snippet below? With such a definition, every time I call new_conv in the forward step, I suspect the weights are reassigned. So even if the expression works, it would effectively be untrainable.

def new_conv(x,dim):
    W   = Variable(torch.randn(dim, 1).type(dtype), requires_grad=True)
    xMean = x.mean(dim=1, keepdim=True).mean(dim=2, keepdim=True).mean(dim=3, keepdim=True)
    x = x - xMean 
        
    w = W**2
    wMean = w.mean(dim=1, keepdim=True).mean(dim=2, keepdim=True).mean(dim=3, keepdim=True)
    w = F.relu(w-wMean)
    x = torch.cat((x, x**2), 1)            # Channelwise concatenating (dim=1 is the channel dimension)
        
    return F.relu(F.conv2d(x, w))

Many thanks…

Hi,

  1. Not sure what you refer to in that question.
  2. You should always call super().__init__() if you subclass nn.Module and write a custom __init__() function.
  3. Autograd allows that. Just note that no gradient (a gradient of 0) will flow back through the x < 5 side, because the comparison gives a piecewise-constant output whose gradient is 0 almost everywhere. In general, the autograd engine will raise an exception if it cannot compute the gradients you ask for. (See the small example after this list.)
  4. You want to use an nn.Module rather than just a Python function if you have parameters, which is actually your case here. Note as well that parameters should be of type nn.Parameter (in your case: self.W = nn.Parameter(torch.randn(dim, 1).type(dtype))) so that they are properly found as parameters of your network when you call net.parameters() (for example, to hand all your parameters to your optimizer). See the sketch after this list.
    If you use the function you showed, it will generate new random weights every time and no learning will happen, which I guess is not what you want, since you set requires_grad=True here.
    Keep in mind that in PyTorch, your code is run at every forward pass.
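
For point 3, a minimal sketch of the MATLAB-style mask in PyTorch and the gradient it produces (the values are made up for illustration):

import torch

# a small tensor with gradients enabled
x = torch.tensor([1.0, 4.0, 7.0], requires_grad=True)

# MATLAB-like x .* (x < 5): the comparison yields a mask, cast it to float
y = x * (x < 5).float()
y.sum().backward()

# the gradient is 1 where the mask kept the value and 0 where it zeroed it;
# no gradient flows back through the comparison itself
print(x.grad)   # tensor([1., 1., 0.])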
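
For points 2 and 4 together, a minimal sketch of how the module could be written as an nn.Module (the argument names in_channels, out_channels and kernel_size are assumptions, and the weight's input channels are doubled so the convolution matches the channel-wise concatenation of x and x**2):

import torch
import torch.nn as nn
import torch.nn.functional as F

class NewConv(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size):
        super().__init__()  # point 2: call the parent __init__ first
        # point 4: register the weight as an nn.Parameter so that
        # net.parameters() (and therefore the optimizer) can find it;
        # assumed shape: (out_channels, 2 * in_channels, k, k)
        self.W = nn.Parameter(
            torch.randn(out_channels, 2 * in_channels, kernel_size, kernel_size))

    def forward(self, x):
        # remove the per-sample mean over channels and spatial dimensions
        xMean = x.mean(dim=1, keepdim=True).mean(dim=2, keepdim=True).mean(dim=3, keepdim=True)
        x = x - xMean

        # build the effective kernel from an expression over the parameter
        w = self.W ** 2
        wMean = w.mean(dim=1, keepdim=True).mean(dim=2, keepdim=True).mean(dim=3, keepdim=True)
        w = F.relu(w - wMean)

        # channel-wise concatenation (dim=1 is the channel dimension in NCHW)
        x = torch.cat((x, x ** 2), 1)

        return F.relu(F.conv2d(x, w))

# quick check of the sketch
net = NewConv(in_channels=3, out_channels=8, kernel_size=3)
print(net(torch.randn(4, 3, 32, 32)).shape)   # torch.Size([4, 8, 30, 30])
print(len(list(net.parameters())))            # 1: the registered W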

@albanD Thank you for your answer. It was informative and helpful for me.