New custom convolution

Hi. I am a beginner in PyTorch and would like to add a new convolutional layer. The new layer takes an extra input called the attention layer (the same size as the input). Before the convolution, the kernel values are modified using this attention layer, and only then is the convolution applied. I am also a beginner in Python and have no idea how to implement the forward and backward functions required for this kind of operation. Can somebody help me build this layer?


You don’t need to write a backward.

You can write a custom nn.Module that inherits from nn.Conv2d.
Then modify the forward to take an extra argument, compute the masked weights, call functional.conv2d to evaluate the convolution with these new weights, and return the result.

Could you please point me to some useful links?


I’m not sure which links you want.
Look up conv2d in the docs and you will find everything you need.

Here is the code I would write:

import torch
from torch import nn
from torch.nn import functional as F

class MyConv(nn.Conv2d):
    # Assuming you don't want to change the init

    def forward(self, input, mask):
        new_weight = self.weight * mask # or whatever your masking is supposed to do

        return F.conv2d(input, new_weight, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)
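For reference, here is a minimal usage sketch of the layer above. It assumes the mask broadcasts against the weight tensor of shape (out_channels, in_channels, kH, kW); the shapes and the mask itself are made up for illustration. It also shows that calling backward just works without writing one yourself.

```python
import torch
from torch import nn
from torch.nn import functional as F


class MyConv(nn.Conv2d):
    def forward(self, input, mask):
        # mask must broadcast against self.weight: (out_ch, in_ch, kH, kW)
        new_weight = self.weight * mask
        return F.conv2d(input, new_weight, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)


torch.manual_seed(0)
conv = MyConv(3, 8, kernel_size=3, padding=1)
x = torch.randn(2, 3, 16, 16)
mask = torch.rand(1, 1, 3, 3)  # hypothetical mask, shared by all channels

out = conv(x, mask)
out.sum().backward()  # autograd derives the backward pass

out_shape = tuple(out.shape)
weight_grad_shape = tuple(conv.weight.grad.shape)
```

Note that a mask of the same size as the input would not broadcast against the weights here; that case needs a per-location formulation instead.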

I want to change the kernel weights each time I slide the kernel across the feature map. Will this implementation do that, or should I use the unfold function for it?

Oh, if you don’t want a “regular” convolution then yes, you will have to use unfold.
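A rough sketch of what the unfold route can look like, assuming the attention map has the same shape as the input and that the kernel is simply multiplied by the attention values under each window. The function name `attention_conv2d` and the choice of modulation are illustrative, not something from the PyTorch API.

```python
import torch
from torch.nn import functional as F


def attention_conv2d(x, att, weight, stride=1, padding=0, dilation=1):
    """Convolution whose kernel is rescaled by `att` at every output location.

    x, att: (N, C, H, W); weight: (O, C, kH, kW); stride/padding/dilation: ints.
    """
    n, _, h, w = x.shape
    o, _, kh, kw = weight.shape
    opts = dict(kernel_size=(kh, kw), dilation=dilation, padding=padding, stride=stride)
    x_cols = F.unfold(x, **opts)    # (N, C*kH*kW, L), one column per output location
    a_cols = F.unfold(att, **opts)  # attention values under the same windows
    # Effective kernel at location l is weight[o, k] * a_cols[n, k, l].
    # Swap this product for whatever modulation you actually need.
    out = torch.einsum('ok,nkl->nol', weight.reshape(o, -1), x_cols * a_cols)
    h_out = (h + 2 * padding - dilation * (kh - 1) - 1) // stride + 1
    w_out = (w + 2 * padding - dilation * (kw - 1) - 1) // stride + 1
    return out.reshape(n, o, h_out, w_out)


torch.manual_seed(0)
x = torch.randn(2, 3, 8, 8)
weight = torch.randn(4, 3, 3, 3)

# Sanity check: an all-ones attention map reproduces a regular convolution.
ones = torch.ones_like(x)
matches_conv2d = torch.allclose(attention_conv2d(x, ones, weight, padding=1),
                                F.conv2d(x, weight, padding=1), atol=1e-4)

att = torch.rand_like(x)
out = attention_conv2d(x, att, weight, padding=1)
```

With this plain elementwise product the result is the same as convolving `x * att`, so unfold only becomes necessary once the modulation is more than a per-pixel scale (for example, something that compares each neighbour to the window centre).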

When I use unfold, do I need to define the backward function?

As long as you only use PyTorch operators and you want the actual gradient of what you compute, you don’t need to define a backward function.
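If you want to convince yourself, one option is to run `torch.autograd.gradcheck` on a small unfold-based function. The sketch below uses a hypothetical function `patch_weighted_sum` and tiny double-precision tensors (gradcheck compares against finite differences, so it needs float64).

```python
import torch
from torch.nn import functional as F


def patch_weighted_sum(x, att, weight):
    # Built only from PyTorch operators, so autograd can differentiate it.
    x_cols = F.unfold(x, kernel_size=3, padding=1)
    a_cols = F.unfold(att, kernel_size=3, padding=1)
    return torch.einsum('ok,nkl->nol', weight.flatten(1), x_cols * a_cols)


torch.manual_seed(0)
x = torch.randn(1, 2, 4, 4, dtype=torch.double, requires_grad=True)
att = torch.rand(1, 2, 4, 4, dtype=torch.double, requires_grad=True)
weight = torch.randn(3, 2, 3, 3, dtype=torch.double, requires_grad=True)

# Compares autograd's gradients with numerical ones; raises if they disagree.
grads_ok = torch.autograd.gradcheck(patch_weighted_sum, (x, att, weight))
```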

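class MyConv(nn.Conv2d):
    # Assuming you don't want to change the init
    def forward(self, input, mask):
        # Pass stride, padding and dilation by keyword: nd2col's positional order
        # (stride, padding, output_padding, dilation) differs from F.unfold's.
        # Padding is hard-coded to 1 here.
        cols = nd2col(input, self.kernel_size, stride=self.stride, padding=1,
                      dilation=self.dilation)
        cols_att = nd2col(mask, self.kernel_size, stride=self.stride, padding=1,
                          dilation=self.dilation)
        # cols, cols_att: (N, C, kH, kW, H_out, W_out). 'zykl' sums the weight
        # over its output-channel (z) and input-channel (y) axes.
        weight_new = torch.einsum('ijklmn,zykl->ijklmn', (cols_att, self.weight))
        # Sum over the kernel window (k, l); the result is (N, C, H_out, W_out).
        output = torch.einsum('ijklmn,ijklmn->ijmn', (cols, weight_new))
        return output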

The nd2col function is used instead of unfold:

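from numbers import Number

from torch.nn.modules.utils import _pair

try:
    import pyinn as P  # optional dependency
    has_pyinn = True
except ImportError:
    P, has_pyinn = None, False


def nd2col(input_nd, kernel_size, stride=1, padding=0, output_padding=0, dilation=1, transposed=False,
           use_pyinn_if_possible=False):
    """
    Shape:
        - Input: :math:`(N, C, L_{in})`
        - Output: :math:`(N, C, *kernel_size, *L_{out})` where
          :math:`L_{out} = floor((L_{in} + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1)` for non-transposed
          :math:`L_{out} = (L_{in} - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + 1 + output_padding` for transposed
    """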
    n_dims = len(input_nd.shape[2:])
    kernel_size = (kernel_size,) * n_dims if isinstance(kernel_size, Number) else kernel_size
    stride = (stride,) * n_dims if isinstance(stride, Number) else stride
    padding = (padding,) * n_dims if isinstance(padding, Number) else padding
    output_padding = (output_padding,) * n_dims if isinstance(output_padding, Number) else output_padding
    dilation = (dilation,) * n_dims if isinstance(dilation, Number) else dilation

    if transposed:
        assert n_dims == 2, 'Only 2D is supported for fractional strides.'
        w_one = input_nd.new_ones(1, 1, 1, 1)
        pad = [(k - 1) * d - p for (k, d, p) in zip(kernel_size, dilation, padding)]
        input_nd = F.conv_transpose2d(input_nd, w_one, stride=stride)
        input_nd = F.pad(input_nd, (pad[1], pad[1] + output_padding[1], pad[0], pad[0] + output_padding[0]))
        stride = _pair(1)
        padding = _pair(0)

    (bs, nch), in_sz = input_nd.shape[:2], input_nd.shape[2:]
    out_sz = tuple([((i + 2 * p - d * (k - 1) - 1) // s + 1)
                    for (i, k, d, p, s) in zip(in_sz, kernel_size, dilation, padding, stride)])
    # Use PyINN if possible (about 15% faster) TODO confirm the speed-up
    if n_dims == 2 and dilation == 1 and has_pyinn and torch.cuda.is_available() and use_pyinn_if_possible:
        output = P.im2col(input_nd, kernel_size, stride, padding)
    else:
        output = F.unfold(input_nd, kernel_size, dilation, padding, stride)
        out_shape = (bs, nch) + tuple(kernel_size) + out_sz
        output = output.view(*out_shape).contiguous()
    return output

This code works. Is there any mistake in doing it this way? Please give me your suggestions.


PyINN is a bit old (from around v0.2), so it might not work with the latest PyTorch.

But otherwise, the code looks good.
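If PyINN turns out to be a problem, the 2D non-transposed case can probably be covered with `F.unfold` alone. Below is a minimal sketch under that assumption; `unfold2d_cols` and `_as_pair` are hypothetical names, not part of nd2col or PyTorch.

```python
import torch
from torch.nn import functional as F


def _as_pair(v):
    # Turn an int into (v, v); leave tuples untouched.
    return (v, v) if isinstance(v, int) else tuple(v)


def unfold2d_cols(x, kernel_size, stride=1, padding=0, dilation=1):
    """Return patches shaped (N, C, kH, kW, H_out, W_out) using only F.unfold."""
    kernel_size, stride, padding, dilation = map(
        _as_pair, (kernel_size, stride, padding, dilation))
    bs, nch = x.shape[:2]
    # Same output-size formula as in the nd2col docstring (non-transposed case).
    out_sz = tuple((i + 2 * p - d * (k - 1) - 1) // s + 1
                   for i, k, d, p, s in zip(x.shape[2:], kernel_size,
                                            dilation, padding, stride))
    cols = F.unfold(x, kernel_size, dilation, padding, stride)  # (N, C*kH*kW, L)
    return cols.view(bs, nch, *kernel_size, *out_sz)


x = torch.arange(2 * 3 * 5 * 5, dtype=torch.float32).view(2, 3, 5, 5)
cols = unfold2d_cols(x, kernel_size=3, padding=1)
cols_shape = tuple(cols.shape)
# With a 3x3 kernel and padding 1, the centre tap of every window is the input pixel itself.
centre_is_input = torch.equal(cols[:, :, 1, 1], x)
```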