Setting custom kernel for CNN in pytorch

markroxor · October 13, 2018, 11:37am

Is there a way to specify our own custom kernel values for a convolutional neural network in PyTorch? Something like kernel_initializer in TensorFlow? E.g. I want a 3x3 kernel in nn.Conv2d, initialized so that it acts as an identity kernel -

0 0 0
0 1 0
0 0 0

(this will effectively return the same output as my input in the very first iteration)

My non-exhaustive research on the subject -

I could use nn.init, but it only has some pre-defined kernel initialisation values.

I tried to follow the discussion on this thread but it doesn’t suit my needs.

I might have missed something in my research; please feel free to point it out.

I have asked the same on SO here but couldn’t find any answer.

ptrblck · October 13, 2018, 11:51am

I think the easiest way would be to use the functional API.
You would have to define the weights and use F.conv2d to apply the convolution.
Here is a small example:

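import torch
import torch.nn.functional as F

nb_channels = 1
h, w = 5, 5
x = torch.randn(1, nb_channels, h, w)
weights = torch.tensor([[0., 0., 0.],
                        [0., 1., 0.],
                        [0., 0., 0.]])
# F.conv2d expects weights shaped (out_channels, in_channels, kH, kW)
weights = weights.view(1, 1, 3, 3).repeat(1, nb_channels, 1, 1)

output = F.conv2d(x, weights)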
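
A quick check of the example above, as a minimal sketch. The padding=1 argument is an addition that is not in the post: without padding the 5x5 input shrinks to 3x3, so the identity kernel returns only the centre crop of the input, while padding=1 keeps the size and reproduces the input.

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 5, 5)

# 3x3 identity kernel, shaped (out_channels, in_channels, kH, kW)
weights = torch.tensor([[0., 0., 0.],
                        [0., 1., 0.],
                        [0., 0., 0.]]).view(1, 1, 3, 3)

# No padding: the output is the 3x3 centre crop of x
cropped = F.conv2d(x, weights)

# padding=1 (assumption, not in the original post) keeps the 5x5 size,
# so the identity kernel gives back the input
same = F.conv2d(x, weights, padding=1)

print(cropped.shape, same.shape)
```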

markroxor · October 13, 2018, 11:59am

Thank you for your response.

I already know about F.conv2d, but I want to use the kernels not just for a plain convolution but in a CNN layer (nn.Conv2d), where the weights are learned.

I don’t think F.conv2d will help.

ptrblck · October 13, 2018, 12:07pm

You would need to set requires_grad=True for the weights and it would also work as nn.Conv2d internally just calls the functional API, see here.

However, if you prefer to use the module, you could try the following code:

weights = ...
conv = nn.Conv2d(nb_channels, 1, 3, bias=False)
with torch.no_grad():
    conv.weight = nn.Parameter(weights)

output = conv(x)
output.mean().backward()
print(conv.weight.grad)
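
Two variations on the module approach, as a hedged sketch rather than the method from the post. Option A copies the values into the existing parameter in place, which keeps the same Parameter object (useful if an optimizer already holds a reference to it). Option B uses nn.init.dirac_, which fills a convolution weight with an identity (Dirac delta) kernel. The padding=1 setting and the variable names are assumptions made for this illustration.

```python
import torch
import torch.nn as nn

nb_channels = 1
x = torch.randn(1, nb_channels, 5, 5)

identity = torch.tensor([[0., 0., 0.],
                         [0., 1., 0.],
                         [0., 0., 0.]]).view(1, 1, 3, 3)

# Option A: overwrite the existing parameter in place
conv_a = nn.Conv2d(nb_channels, 1, 3, padding=1, bias=False)
with torch.no_grad():
    conv_a.weight.copy_(identity)

# Option B: let the built-in Dirac initialiser build the identity kernel
conv_b = nn.Conv2d(nb_channels, 1, 3, padding=1, bias=False)
nn.init.dirac_(conv_b.weight)

out_a = conv_a(x)
out_b = conv_b(x)

# The weight is still a trainable parameter, so gradients are computed
out_a.mean().backward()
grad = conv_a.weight.grad
print(grad.shape)
```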

markroxor · October 13, 2018, 12:09pm

That makes sense, thank you. Let me try it out.

vijaytida · June 27, 2019, 5:22am

Hi,
I want to cast the data after the elementwise products are computed but before they are summed in F.conv2d. Could you please give me any suggestions?

ptrblck · June 27, 2019, 10:23am

As far as I understand you would like to split the convolution operation and add a custom op before the summation.
If so, I think you would need to implement the conv op manually using e.g. unfold.
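
A minimal sketch of that idea using F.unfold, assuming stride 1 and no padding. The function name conv2d_with_op and its op hook are hypothetical, introduced only for this illustration. The elementwise products are materialised before the summation, so a custom operation (here a float16 round trip as a stand-in for a cast) can be applied in between. Note that this uses far more memory than a regular convolution.

```python
import torch
import torch.nn.functional as F

def conv2d_with_op(x, weight, op=lambda t: t):
    """Convolution (stride 1, no padding) with `op` applied to the
    elementwise products before they are summed. Hypothetical helper."""
    B, C, H, W = x.shape
    O, _, kH, kW = weight.shape
    # (B, C*kH*kW, L): one column per sliding-window position
    patches = F.unfold(x, kernel_size=(kH, kW))
    # Broadcast to (B, O, C*kH*kW, L): every product of the convolution
    products = patches.unsqueeze(1) * weight.view(1, O, -1, 1)
    products = op(products)       # custom op before the summation
    out = products.sum(dim=2)     # (B, O, L)
    return out.view(B, O, H - kH + 1, W - kW + 1)

x = torch.randn(2, 3, 8, 8)
weight = torch.randn(4, 3, 3, 3)

# With the default (identity) op this matches F.conv2d up to rounding
plain = conv2d_with_op(x, weight)
reference = F.conv2d(x, weight)

# Example op: cast the products to float16 and back before summing
cast = conv2d_with_op(x, weight, op=lambda t: t.half().float())
print(plain.shape, cast.shape)
```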

vijaytida · June 27, 2019, 5:24pm

How can I see the algorithmic implementation of convolution using pytorch?

ptrblck · June 28, 2019, 9:41am

aten/src/ATen/native/Convolution.cpp might be a good starting point to see which algorithms are being dispatched.

Christopher_Aykroyd · December 28, 2019, 6:57pm

So, how does it work when I have multiple channels?
For example, the code below will output tensors of shape (B, 2, W, H). Instead, I would like each of the two kernels to be applied to every channel separately, giving an output of shape (B, 2C, W, H).

def get_grad_kernel(channels):
  Iy = [[ 1, 2, 1],
        [ 0, 0, 0],
        [-1,-2,-1]]
  Ix = [[-1, 0, 1],
        [-2, 0, 2],
        [-1, 0, 1]]
  return torch.Tensor([ [Ix]*channels, [Iy]*channels ])/4

grad_kernel = get_grad_kernel(channels=3)
compute_gradient = lambda image: F.conv2d(input=image, weight=grad_kernel, padding=1)

EDIT: From ptrblck’s reply, making the kernel as the following

  return torch.Tensor([ [Ix], [Iy] ]*channels)/4

and setting groups=image.size(1) works nicely

ptrblck · December 28, 2019, 7:21pm

Setting groups=in_channels might work for your use case.
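
A runnable sketch of the groups approach for the gradient kernels from the question. With groups equal to the number of input channels, each group sees a single input channel and produces two outputs (Ix and Iy), so the weight has shape (2*C, 1, 3, 3). The image size and batch size are arbitrary values chosen for this illustration.

```python
import torch
import torch.nn.functional as F

def get_grad_kernel(channels):
    Iy = [[ 1,  2,  1],
          [ 0,  0,  0],
          [-1, -2, -1]]
    Ix = [[-1, 0, 1],
          [-2, 0, 2],
          [-1, 0, 1]]
    # Shape (2*channels, 1, 3, 3): an (Ix, Iy) pair for every input channel
    return torch.tensor([[Ix], [Iy]] * channels, dtype=torch.float32) / 4

image = torch.randn(5, 3, 16, 16)   # arbitrary example input
channels = image.size(1)
grad_kernel = get_grad_kernel(channels)

# Output channels 2*c and 2*c + 1 hold the Ix and Iy responses of channel c
out = F.conv2d(image, grad_kernel, padding=1, groups=channels)
print(out.shape)  # torch.Size([5, 6, 16, 16])
```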

Sepehr.nem · July 16, 2020, 12:30pm

Thanks for your informative answers.
I am struggling to add more than one weight (let’s say two 3x3 kernels) in conv2d() to get more than one output at the same time (nn.Conv2d(nb_channels, 2, 3, bias=False)).
Could you please help me out with this?

ptrblck · July 16, 2020, 11:30pm

The number of kernels is defined in dim0 in the weight matrix and you could use this modified code for it:

nb_channels = 1
h, w = 5, 5
x = torch.randn(1, nb_channels, h, w)
weights = torch.tensor([[[0., 0., 0.],
                         [0., 1., 0.],
                         [0., 0., 0.]],
                        [[0., 0., 0.],
                         [0., 1., 0.],
                         [0., 0., 0.]]])
weights = weights.view(2, 1, 3, 3)

output = F.conv2d(x, weights)

Let me know, if this would work for you or if you get stuck somewhere.

Sepehr.nem · July 17, 2020, 3:44pm

Thanks a lot for your answer.
This works out for me but I want to implement the discussed Conv2D weights into this simple network:

class CNN(nn.Module):

    def __init__(self):
        super(CNN, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 2, kernel_size=3, stride=1, padding=0),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=1),
        )
        self.drop_out = torch.nn.Dropout(0.6)

    def forward(self, x):
        x = self.features(x)
        x = self.drop_out(x)
        return x

cnn = CNN()

I would appreciate a hint on this as well.
Thanks

ptrblck · July 18, 2020, 2:28am

You can assign a new weight parameter to self.features via:

self.features = nn.Conv2d(...)
with torch.no_grad():
    weights = torch.tensor(...)
    self.features.weight = nn.Parameter(weights)
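
A minimal sketch of the same idea for the case where the convolution sits inside an nn.Sequential, which is an assumption about the network in question. The conv layer is then reached by index (self.features[0]). Using bias=False and two identity kernels as the custom weights are also choices made only for this illustration.

```python
import torch
import torch.nn as nn

class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 2, kernel_size=3, stride=1, padding=0, bias=False),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=1),
        )
        self.drop_out = nn.Dropout(0.6)

        # Two 3x3 kernels stacked along dim0 -> shape (2, 1, 3, 3)
        identity = torch.tensor([[0., 0., 0.],
                                 [0., 1., 0.],
                                 [0., 0., 0.]])
        weights = identity.repeat(2, 1, 1, 1)
        with torch.no_grad():
            # Index 0 is the conv layer inside the Sequential
            self.features[0].weight = nn.Parameter(weights)

    def forward(self, x):
        x = self.features(x)
        return self.drop_out(x)

cnn = CNN()
cnn.eval()  # disable dropout for a deterministic forward pass
out = cnn(torch.randn(1, 1, 5, 5))
print(out.shape)
```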

Sepehr.nem · July 21, 2020, 1:13pm

Thank you for your answer

spacemeerkat · April 16, 2021, 12:47pm

Is there a way to adapt this approach so that rather than performing convolutions using different kernels per channel, it performs convolutions on all channels for a tensor in a batch but the kernel for each tensor in the batch changes? So each batch item undergoes a convolution with a unique prespecified kernel.

e.g. something like this:

import torch 

batch_size = 8
channels = 10
img_size = 30
kernel_size = 3

batch = torch.rand((batch_size,channels,img_size,img_size))

# Make a unique kernel for each batch member but the kernel is convolved 
# with every channel 
weights = torch.rand((batch_size,1,kernel_size,kernel_size)).repeat(1,channels,1,1)
print(weights.shape)

conv = torch.nn.Conv2d(channels,channels,kernel_size,padding=4,bias=False)

with torch.no_grad():
    conv.weight = torch.nn.Parameter(weights,requires_grad=False)

output = conv(batch)
print(output.shape)

Edit: This has been solved using for loops or groups here
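
A hedged sketch of what the groups variant could look like, under one reading of the question: every sample has its own 3x3 kernel, and that kernel is applied to each of the sample's channels independently. The batch is folded into the channel dimension so that a single grouped convolution handles all samples; a per-sample loop is included as a reference. The padding=1 setting and the variable names are assumptions made for this illustration.

```python
import torch
import torch.nn.functional as F

batch_size, channels, img_size, kernel_size = 8, 10, 30, 3
batch = torch.rand(batch_size, channels, img_size, img_size)
kernels = torch.rand(batch_size, 1, kernel_size, kernel_size)  # one per sample

# Fold the batch into the channel dimension: (1, B*C, H, W)
folded = batch.view(1, batch_size * channels, img_size, img_size)
# One single-channel filter per (sample, channel) pair: (B*C, 1, k, k)
weights = kernels.repeat(1, channels, 1, 1).view(
    batch_size * channels, 1, kernel_size, kernel_size)

out = F.conv2d(folded, weights, padding=1, groups=batch_size * channels)
out = out.view(batch_size, channels, img_size, img_size)

# Reference: loop over samples, treating the channels as a batch of
# single-channel images that share the sample's kernel
ref = torch.stack([
    F.conv2d(batch[b:b + 1].transpose(0, 1), kernels[b:b + 1], padding=1).squeeze(1)
    for b in range(batch_size)
])
print(out.shape, ref.shape)
```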

Lukman_Hakim · June 9, 2021, 6:30am

Hello @ptrblck, I tried this code but I got the error "Illegal instruction (core dumped)".

ptrblck · June 9, 2021, 6:40am

I’m not sure which code you were trying. Could you post more information about your use case, setup (installed PyTorch version, CUDA etc.) and could you also try to get a backtrace from gdb?

fevr · September 18, 2022, 12:28pm

Hello, I tested the two methods mentioned above:

nb_channels = 1
h, w = 5, 5
x = torch.ones(1, nb_channels, h, w)
weights = torch.tensor([[0.,0., 0.],
                        [0, 1., 0],
                        [0., 0., 0.]])
weights = weights.view(1, 1, 3, 3).repeat(1, nb_channels, 1, 1)
output = F.conv2d(x, weights)

output:
tensor([[[[1., 1., 1.],
          [1., 1., 1.],
          [1., 1., 1.]]]])

class train_model(nn.Module):
    def __init__(self,n_feature,n_out):
        super(train_model, self).__init__()
        self.conv4 = nn.Conv2d(n_feature, n_out, 3, padding=0)
        with torch.no_grad():
            weights = torch.tensor([[0.,0., 0.],
                        [0, 1., 0],
                        [0., 0., 0.]])
            weights = weights.view( 1,1, 3, 3).repeat(1, n_feature, 1, 1)
            self.conv4.weight = nn.Parameter(weights)
        
    def forward(self, x):
        x = self.conv4(x)
        return x

model = train_model(n_feature=1,n_out=1)
x = torch.ones(1, 1,5, 5)
output=model(x)

I thought the two outputs should be the same (?), but the second one is not even constant.
I can’t figure out why.
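
One likely explanation, not confirmed in the thread: nn.Conv2d is created with bias=True by default and only its weight is replaced, so the module adds a randomly initialised bias that the F.conv2d call (made without a bias) does not. That would also explain why the result changes from run to run. A minimal sketch comparing the three cases, with variable names chosen for this illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.ones(1, 1, 5, 5)
weights = torch.tensor([[0., 0., 0.],
                        [0., 1., 0.],
                        [0., 0., 0.]]).view(1, 1, 3, 3)

# Functional call without a bias term
functional_out = F.conv2d(x, weights)

# Module with the default bias=True: the random bias is still in place
conv_default = nn.Conv2d(1, 1, 3, padding=0)
conv_default.weight = nn.Parameter(weights.clone())
with_bias = conv_default(x)

# Module with bias=False: matches the functional result
conv_nobias = nn.Conv2d(1, 1, 3, padding=0, bias=False)
conv_nobias.weight = nn.Parameter(weights.clone())
without_bias = conv_nobias(x)

# The difference between the two results is the bias value
offset = (with_bias - functional_out).detach()
print(offset.flatten()[0].item(), conv_default.bias.item())
```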