How can I train the parameters of a kernel generator, instead of training the kernel values on a CNN with PyTorch?

I have a math function that generates a custom kernel for use in the convolution layer.

This function has 3 parameters ‘a’, ‘b’ and ‘c’, as follows:
kernel = generator(a, b, c)

I would like to build a CNN model that, instead of training the kernel matrix values, trains just the parameters ‘a’, ‘b’ and ‘c’ that generate the kernel.

How can I do that in PyTorch?

Example:

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = nn.Conv2d(1, 1, kernel_size=7, bias=False)
        # Other layers here

        # Generate the custom kernel
        kernel = kernel_generator(a, b, c)

        # Question: how do I train a, b, c
        # with gradient descent???

        # Initialize conv1 with custom kernel
        self.conv1.weight = nn.Parameter(kernel)

    def forward(self, x):
        x = self.conv1(x)
        # pass x to other modules
        return x

You should register a, b, c as nn.Parameter, and then do something like this:

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        # Other layers here

        # define a, b, c and register them as Parameters;
        # they must be attributes of the module to show up
        # in model.parameters()
        self.a = nn.Parameter(torch.randn(10))
        self.b = nn.Parameter(torch.randn(10))
        self.c = nn.Parameter(torch.randn(10))

    def forward(self, x):
        # Generate the custom kernel from the trainable parameters
        kernel = kernel_generator(self.a, self.b, self.c)
        x = torch.nn.functional.conv2d(x, kernel)
        # pass x to other modules
        return x

See the documentation for torch.nn.functional.conv2d here: https://pytorch.org/docs/stable/nn.functional.html?highlight=conv2d#torch.nn.functional.conv2d
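
To sanity-check that gradients actually reach a, b and c, you can run a dummy forward/backward pass. A minimal sketch, with a hypothetical kernel_generator (any differentiable function of a, b, c that returns an (out_channels, in_channels, kH, kW) tensor will do):

import torch
import torch.nn as nn

# Hypothetical stand-in for the real generator
def kernel_generator(a, b, c):
    k = a[:7].unsqueeze(1) * b[:7].unsqueeze(0) * c.sum()  # 7x7 kernel
    return k.view(1, 1, 7, 7)

model = MyModel()
out = model(torch.randn(1, 1, 28, 28))
out.mean().backward()
print(model.a.grad is not None)  # True: gradients reach the generator parameters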

I registered a, b, c as nn.Parameter and generated the kernel from a, b, c with some operations like add and multiply. When I apply these operations to (a, b, c), the kernel becomes a plain tensor, not an nn.Parameter. But it seems F.conv3d can only receive an nn.Parameter as weight; I got this error:

x = F.conv3d(x, self.kernel, stride=1, padding=self.padding)
RuntimeError: Cannot insert a Tensor that requires grad as a constant. Consider making it a parameter or input, or detaching the gradient

If I change the kernel to an nn.Parameter, it becomes a leaf node and has no grad_fn, so (a, b, c) can’t be trained anymore. How can I train the parameters in PyTorch?
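
To make the issue concrete, a minimal sketch of the leaf-node problem (toy shapes, not my real generator):

import torch
import torch.nn as nn

a = nn.Parameter(torch.randn(3))
kernel = torch.sigmoid(a) + 1           # non-leaf tensor, has a grad_fn
print(kernel.grad_fn)                   # <AddBackward0 object at ...>

frozen = nn.Parameter(kernel.detach())  # wrapping as Parameter means detaching
print(frozen.grad_fn)                   # None -> gradients no longer reach a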

I use PyTorch 1.5, CUDA 10.1, cuDNN 7.5.

Thank you

This is certainly not what is causing the error. Quite likely, more context would help.

Thank you for reply.
This is my code.

class MyModel(nn.Module):

    def __init__(self, in_channels, out_channels, kernel_size, padding=(0, 2, 2), dilation=1, bias=True, groups=1):
        super(MyModel, self).__init__()

        self.in_channels = in_channels
        self.out_channels = out_channels
        self.groups = groups
        self.padding = padding
        self.relu = nn.ReLU()
        self.bn = nn.BatchNorm3d(out_channels, eps=1e-5, momentum=0.1)

        # Register them as parameters
        self.a = self.get_param(self.in_channels, self.out_channels, self.groups)
        self.b = self.get_param(self.in_channels, self.out_channels, self.groups)
        self.c = self.get_param(self.in_channels, self.out_channels, self.groups)

        # Generate the custom kernel
        self.weight = self.get_weight(self.a, self.b, self.c)

    def get_param(self, in_channels, out_channels, groups):
        param = torch.zeros([out_channels, in_channels // groups, 1, 1, 1], dtype=torch.float)
        param = param.cuda()
        nn.init.xavier_normal_(param, gain=nn.init.calculate_gain('sigmoid'))
        return nn.Parameter(param)

    def get_weight(self, a, b, c):
        one = torch.ones([self.out_channels, self.in_channels // self.groups, 1, 1, 1], dtype=torch.float).cuda()
        bias = torch.sigmoid(c) + one
        kernel_x = torch.cat([bias - torch.sigmoid(a),
                              bias - 1 / 2 * torch.sigmoid(a),
                              bias,
                              bias - 1 / 2 * torch.sigmoid(a),
                              bias - torch.sigmoid(a)], dim=3)
        kernel_x = kernel_x.repeat((1, 1, 1, 1, 5))
        kernel_y = torch.cat([bias - torch.sigmoid(b),
                              bias - 1 / 2 * torch.sigmoid(b),
                              bias,
                              bias - 1 / 2 * torch.sigmoid(b),
                              bias - torch.sigmoid(b)], dim=4)
        kernel_y = kernel_y.repeat((1, 1, 1, 5, 1))
        kernel = kernel_x + kernel_y
        # kernel has a grad_fn (AddBackward)
        return kernel

    def forward(self, x):

        # Error occurs here
        x = F.conv3d(x, self.weight, padding=self.padding)

        x = self.bn(x)
        x = self.relu(x)
        return x
I have to train the a, b, c parameters of the kernel.

My error is:
RuntimeError: Cannot insert a Tensor that requires grad as a constant. Consider making it a parameter or input, or detaching the gradient

When I register the kernel as an nn.Parameter, there is no error. But then the kernel has no grad_fn, so I cannot train the a, b, c parameters.

And I found that this error comes from https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/frontend/tracer.cpp#L147
But I am having trouble finding a solution.

Best Regards

Ah, you trace the model. I didn’t quite get that in your first post.
The main thing is that you likely want the weight to be recomputed in forward rather than computed once in __init__ and then stored as self.weight.

Yes, you are right. Do you have any suggestions? I don’t know where to start. Should I look into torch.jit.trace to apply the kernel to F.conv3d, or just write a custom C++/CUDA convolution extension to avoid the tracing problem?

I would move the line self.weight = ... into forward and change self.weight to weight.
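
Roughly like this, reusing your get_param and get_weight unchanged (a sketch, not tested):

class MyModel(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, padding=(0, 2, 2), dilation=1, bias=True, groups=1):
        super(MyModel, self).__init__()
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.groups = groups
        self.padding = padding
        self.relu = nn.ReLU()
        self.bn = nn.BatchNorm3d(out_channels, eps=1e-5, momentum=0.1)
        self.a = self.get_param(in_channels, out_channels, groups)
        self.b = self.get_param(in_channels, out_channels, groups)
        self.c = self.get_param(in_channels, out_channels, groups)
        # Note: no self.weight stored here anymore

    def forward(self, x):
        # Rebuild the kernel from a, b, c on every call: it is a non-leaf
        # tensor with a grad_fn, so gradients flow back into a, b and c,
        # and the tracer never sees a stored tensor that requires grad.
        weight = self.get_weight(self.a, self.b, self.c)
        x = F.conv3d(x, weight, padding=self.padding)
        x = self.bn(x)
        x = self.relu(x)
        return x

The optimizer then only ever updates a, b and c; the kernel itself is never a parameter.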

Wow… Wow! Thank you so much. It was really helpful.