Setting a custom kernel for a CNN in PyTorch

Is there a way to specify our own custom kernel values for a convolutional neural network in PyTorch? Something like kernel_initializer in TensorFlow? E.g. I want a 3x3 kernel in nn.Conv2d, initialized so that it acts as an identity kernel:

0 0 0
0 1 0
0 0 0

(this will effectively return the same output as my input in the very first iteration)

My non-exhaustive research on the subject -

I could use nn.init, but it only has some pre-defined kernel initialisation schemes.

I tried to follow the discussion on this thread but it doesn’t suit my needs.

I might have missed something in my research, so please feel free to point it out.

I have asked the same on SO here but couldn’t find any answer.


I think the easiest way would be to use the functional API.
You would have to define the weights and use F.conv2d to apply the convolution.
Here is a small example:

import torch
import torch.nn.functional as F

nb_channels = 1
h, w = 5, 5
x = torch.randn(1, nb_channels, h, w)
weights = torch.tensor([[0., 0., 0.],
                        [0., 1., 0.],
                        [0., 0., 0.]])
# Reshape to [out_channels, in_channels, kH, kW]
weights = weights.view(1, 1, 3, 3).repeat(1, nb_channels, 1, 1)

output = F.conv2d(x, weights)

Thank you for your response.

I already know about F.conv2d, but I wanted to use custom kernels not just for a fixed convolution but in a CNN layer (nn.Conv2d), where the weights are learned.

I don’t think F.conv2d will help. :frowning:

You would need to set requires_grad=True for the weights. It would then also work, as nn.Conv2d internally just calls the functional API, see here. :wink:
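As a rough illustration of that point, here is a minimal sketch in which the hand-made identity kernel is trained through F.conv2d. The optimizer (plain SGD), the learning rate, and the all-zero target are assumptions made only for this example and are not part of the original suggestion.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

x = torch.randn(1, 1, 5, 5)

# Identity kernel, reshaped to [out_channels, in_channels, kH, kW].
weights = torch.tensor([[0., 0., 0.],
                        [0., 1., 0.],
                        [0., 0., 0.]]).view(1, 1, 3, 3)
weights.requires_grad_(True)  # make the kernel a learnable leaf tensor

initial = weights.detach().clone()

# Assumed setup for illustration: plain SGD and an arbitrary target.
optimizer = torch.optim.SGD([weights], lr=0.1)
target = torch.zeros(1, 1, 3, 3)

output = F.conv2d(x, weights)   # shape [1, 1, 3, 3] because no padding is used
loss = F.mse_loss(output, target)
loss.backward()                 # populates weights.grad
optimizer.step()                # the kernel is updated from its initial values
```

After the step, weights no longer equals the identity kernel, which is the same learning behaviour an nn.Conv2d layer would show.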

However, if you prefer to use the module, you could try the following code:

weights = ...
conv = nn.Conv2d(nb_channels, 1, 3, bias=False)
with torch.no_grad():
    conv.weight = nn.Parameter(weights)

output = conv(x)
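For the identity kernel from the original question specifically, two variations might be worth trying. The sketch below copies custom values into the existing parameter in place, and alternatively uses nn.init.dirac_, which fills a convolution weight with a Dirac delta (identity) kernel. The padding=1 setting is my own addition so that the output has the same size as the input and can be compared with it.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 5, 5)

# padding=1 keeps the spatial size, so the output can be compared with the input.
conv = nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False)

# Option 1: copy custom values into the existing parameter in place.
identity = torch.tensor([[0., 0., 0.],
                         [0., 1., 0.],
                         [0., 0., 0.]]).view(1, 1, 3, 3)
with torch.no_grad():
    conv.weight.copy_(identity)
out_copy = conv(x)

# Option 2: nn.init.dirac_ fills the weight with an identity (Dirac delta) kernel.
conv2 = nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False)
nn.init.dirac_(conv2.weight)
out_dirac = conv2(x)
```

In both cases the weight stays a regular nn.Parameter, so it is still updated during training.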

That makes sense, thank you. Let me try it out. :slight_smile:

I want to cast the data after the elementwise products are computed but before they are summed in F.conv2d. Can you please give me any suggestions?

As far as I understand you would like to split the convolution operation and add a custom op before the summation.
If so, I think you would need to implement the conv op manually using e.g. unfold.
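A minimal sketch of what such a manual implementation could look like with F.unfold is shown below. The tensor shapes are arbitrary choices for illustration, and the place where a custom op would go is only marked with a comment, since the exact op is not specified here.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

x = torch.randn(2, 3, 5, 5)        # [batch, in_channels, H, W]
weight = torch.randn(4, 3, 3, 3)   # [out_channels, in_channels, kH, kW]

# Extract sliding 3x3 patches: [batch, in_channels * kH * kW, L],
# where L is the number of output positions (3 * 3 = 9 here).
patches = F.unfold(x, kernel_size=3)

# Elementwise products, kept separate from the summation:
# [batch, 1, C*kH*kW, L] * [1, out_channels, C*kH*kW, 1]
products = patches.unsqueeze(1) * weight.view(1, 4, -1, 1)

# A custom op (e.g. a dtype cast; hypothetical) could be applied to `products` here.

# Sum over the C*kH*kW dimension and reshape to the output feature map.
out = products.sum(dim=2).view(2, 4, 3, 3)

# Reference result from the built-in op for comparison.
reference = F.conv2d(x, weight)
```

Note that materializing `products` needs far more memory than the fused op, so this is mainly useful for experiments.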

How can I see the algorithmic implementation of convolution in PyTorch?

aten/src/ATen/native/Convolution.cpp might be a good starting point to see which algorithms are being dispatched.

So, how does it work when I have multiple channels?
For example, the code below outputs tensors of shape (B, 2, W, H). Instead, I would like each of the two kernels to be applied to every channel separately, giving an output of shape (B, 2C, W, H).

def get_grad_kernel(channels):
  # Sobel kernels; the last row of Iy is assumed from the standard Sobel operator
  Iy = [[ 1, 2, 1],
        [ 0, 0, 0],
        [-1,-2,-1]]
  Ix = [[-1, 0, 1],
        [-2, 0, 2],
        [-1, 0, 1]]
  return torch.Tensor([ [Ix]*channels, [Iy]*channels ])/4

grad_kernel = get_grad_kernel(channels=3)
compute_gradient = lambda image: F.conv2d(input=image, weight=grad_kernel, padding=1)

EDIT: From ptrblck’s reply, making the kernel as the following

  return torch.Tensor([ [Ix], [Iy] ]*channels)/4

and setting groups=image.size(1) works nicely

Setting groups=in_channels might work for your use case.
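To make the shapes concrete, here is a small sketch of the grouped version. It assumes the standard Sobel kernels (in particular, the last row of Iy is my assumption) and an arbitrary input size. The weight has shape [2 * C, 1, 3, 3], so with groups=C each input channel is convolved only with its own (Ix, Iy) pair.

```python
import torch
import torch.nn.functional as F

def get_grad_kernel(channels):
    # Sobel kernels for the horizontal (Ix) and vertical (Iy) gradients.
    Ix = [[-1, 0, 1],
          [-2, 0, 2],
          [-1, 0, 1]]
    Iy = [[ 1,  2,  1],
          [ 0,  0,  0],
          [-1, -2, -1]]
    # Shape [2 * channels, 1, 3, 3]: one (Ix, Iy) pair per input channel.
    return torch.tensor([[Ix], [Iy]] * channels, dtype=torch.float32) / 4

image = torch.randn(2, 3, 8, 8)   # [B, C, H, W]
grad_kernel = get_grad_kernel(channels=image.size(1))

# groups=C: each input channel is convolved only with its own two kernels.
grads = F.conv2d(image, grad_kernel, padding=1, groups=image.size(1))
```

The output channels are interleaved per input channel: channels 0 and 1 hold the Ix and Iy responses of input channel 0, channels 2 and 3 those of input channel 1, and so on.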


Thanks for your informative answers.
I am struggling to pass more than one kernel (let’s say two 3x3 kernels) to conv2d() to get more than one output channel at the same time (nn.Conv2d(nb_channels, 2, 3, bias=False)).
Could you please help me out with this?

The number of kernels is defined in dim0 in the weight matrix and you could use this modified code for it:

nb_channels = 1
h, w = 5, 5
x = torch.randn(1, nb_channels, h, w)
weights = torch.tensor([[[0., 0., 0.],
                         [0., 1., 0.],
                         [0., 0., 0.]],
                        [[0., 0., 0.],
                         [0., 1., 0.],
                         [0., 0., 0.]]])
weights = weights.view(2, 1, 3, 3)

output = F.conv2d(x, weights)

Let me know, if this would work for you or if you get stuck somewhere. :slight_smile:


Thanks a lot for your answer.
This works out for me but I want to implement the discussed Conv2D weights into this simple network:

class CNN(nn.Module):

    def __init__(self):
        super(CNN, self).__init__()
        self.features = nn.Conv2d(1, 2, kernel_size=3, stride=1, padding=0)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=1)
        self.drop_out = torch.nn.Dropout(0.6)

    def forward(self, x):
        return x


I would appreciate a hint on this as well.

You can assign a new weight parameter to self.features via:

self.features = nn.Conv2d(...)
with torch.no_grad():
    weights = torch.tensor(...)
    self.features.weight = nn.Parameter(weights)
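Put together, a complete version of the module might look like the sketch below. The attribute name self.pool, the forward pass, and the choice of two identity kernels are assumptions for illustration, since the original post only shows the layer definitions.

```python
import torch
import torch.nn as nn

class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Conv2d(1, 2, kernel_size=3, stride=1, padding=0)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=1)  # hypothetical attribute name
        self.drop_out = nn.Dropout(0.6)

        identity = torch.tensor([[0., 0., 0.],
                                 [0., 1., 0.],
                                 [0., 0., 0.]])
        # Two output kernels, one input channel: [2, 1, 3, 3].
        weights = identity.repeat(2, 1, 1, 1)
        with torch.no_grad():
            self.features.weight = nn.Parameter(weights)

    def forward(self, x):
        # Assumed forward pass: conv -> max pool -> dropout.
        x = self.pool(self.features(x))
        return self.drop_out(x)

model = CNN()
model.eval()  # disable dropout so the output is deterministic
out = model(torch.randn(1, 1, 5, 5))
```

The custom weight is registered as a normal parameter, so it shows up in model.parameters() and is trained like any other weight. The bias of self.features keeps its default initialization.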

Thank you for your answer

Is there a way to adapt this approach so that rather than performing convolutions using different kernels per channel, it performs convolutions on all channels for a tensor in a batch but the kernel for each tensor in the batch changes? So each batch item undergoes a convolution with a unique prespecified kernel.

e.g. something like this:

import torch 

batch_size = 8
channels = 10
img_size = 30
kernel_size = 3

batch = torch.rand((batch_size,channels,img_size,img_size))

# Make a unique kernel for each batch member but the kernel is convolved 
# with every channel 
weights = torch.rand((batch_size,1,kernel_size,kernel_size)).repeat(1,channels,1,1)

conv = torch.nn.Conv2d(channels,channels,kernel_size,padding=4,bias=False)

with torch.no_grad():
    conv.weight = torch.nn.Parameter(weights,requires_grad=False)

output = conv(batch)

Edit: This has been solved using for loops or groups here
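For readers without access to the linked solution, one way the groups approach could be sketched is shown below. It assumes that each channel of a sample is filtered independently with that sample's kernel (no summation across channels), and it uses padding=1 rather than the padding=4 from the snippet above so that the spatial size is preserved. The linked solution may differ in its details.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

batch_size, channels, img_size, kernel_size = 8, 10, 30, 3
batch = torch.rand(batch_size, channels, img_size, img_size)

# One kernel per batch item: [batch_size, 1, k, k].
kernels = torch.rand(batch_size, 1, kernel_size, kernel_size)

# Fold batch and channels into a single channel dimension: [1, B*C, H, W].
folded = batch.reshape(1, batch_size * channels, img_size, img_size)

# Repeat each sample's kernel for all of its channels: [B*C, 1, k, k].
weight = kernels.repeat_interleave(channels, dim=0)

# groups=B*C: every folded channel is convolved with exactly one kernel.
out = F.conv2d(folded, weight, padding=1, groups=batch_size * channels)
out = out.reshape(batch_size, channels, img_size, img_size)
```

A plain for loop over the batch, calling F.conv2d once per sample with groups=channels, should give the same result and can serve as a sanity check.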

Hello @ptrblck, I tried this code but I got the error "Illegal instruction (core dumped)".

I’m not sure which code you were trying. Could you post more information about your use case, setup (installed PyTorch version, CUDA etc.) and could you also try to get a backtrace from gdb?