Using nn.Parameter() in a function without using detach


I am trying to add a new parameter to a standard binary classification network. The parameter alters the input in some way before it is fed to the first CNN layer, and I want it to be learnt during training, so I defined it with torch.nn.Parameter(torch.randn(())). I believe this registers the parameter with the module so that the optimizer updates it during backpropagation. The problem is that I have to use this parameter in a function that relies on cv2 operations, so I called parameter.detach().cpu().numpy(). As a result, the parameter is not being trained at all; I used Netron to visualize the graph and my parameter is missing.
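To illustrate the issue with a minimal sketch (assuming a scalar parameter, not my actual network): once the parameter is detached to call into NumPy/cv2, gradients no longer flow back to it.

```python
import torch

p = torch.nn.Parameter(torch.randn(()))
x = torch.randn(4)

# Using the parameter directly keeps it in the autograd graph:
y = (x * p).sum()
y.backward()
print(p.grad is not None)  # True

# Detaching (e.g. to call into NumPy/cv2) cuts the graph:
z = torch.from_numpy(x.numpy() * p.detach().numpy()).sum()
print(z.requires_grad)  # False -> backward() would never reach p
```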

So the question is: is there a way to use an nn.Parameter inside a function that manipulates the input (using cv2) without the parameter being detached or removed from the computation graph?

My forward function is shown below:

    def forward(self, x):
        x = self.userfunc(x, fixed_param, self.learnable_param*np.pi)
        x = self.Maxpool(self.Dropout(self.Relu(self.BatchNorm1(self.Conv1(x)))))
        x = self.Maxpool(self.Dropout(self.Relu(self.BatchNorm2(self.Conv2(x)))))
        x = x.view(x.size(0), -1)
        x = self.Relu(self.Linear1(x))
        x = self.Dropout(x)
        x = self.Linear2(x)
        return x

The learnable parameter is defined in the `__init__` function as:

    self.learnable_param = torch.nn.Parameter(torch.randn(()))

The body of userfunc starts as:

    def userfunc(self, x, fixed_param, learnable_param):
        inp = x.detach().cpu().numpy()                            # leaves the autograd graph
        learnable_param = learnable_param.detach().cpu().numpy()  # parameter detached here
        x_back = someotherfunc(inp, learnable_param)              # cv2-based processing
        return x_back.cuda()

The function someotherfunc performs some cv2 operations, which is why I could not use learnable_param directly.

I have also tried cloning learnable_param into another variable and calling detach on that clone to manipulate the input, but this did not help either.

Is there any way around this? Please let me know. Thanks in advance for your help.

Hi @Kreddy17,

One potential solution that comes to mind is to write a custom autograd function whose forward method wraps your someotherfunc, with a manually implemented backward method. The backward method needs to return the gradients d(someotherfunc)/d(x) and d(someotherfunc)/d(learnable_param), which you may be able to derive analytically (depending on your cv2 function). An example of a custom autograd function can be found here
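To make that concrete, here is a minimal sketch of such a custom autograd Function. I'm using a toy placeholder for your someotherfunc (a simple `out = x * param` scaling, so the analytic gradients are easy to write down); you would substitute your cv2 pipeline in `forward` and its actual derivatives in `backward`:

```python
import torch

class SomeOtherFunc(torch.autograd.Function):
    """Runs a NumPy/cv2-style op in forward and supplies
    analytic gradients in backward."""

    @staticmethod
    def forward(ctx, x, learnable_param):
        ctx.save_for_backward(x, learnable_param)
        # Leave the graph deliberately: NumPy/cv2 ops are not tracked by autograd.
        inp = x.detach().cpu().numpy()
        p = float(learnable_param.detach().cpu())
        # Placeholder for the real cv2 pipeline: out = x * p
        out = inp * p
        return torch.from_numpy(out).to(x.device, x.dtype)

    @staticmethod
    def backward(ctx, grad_output):
        x, p = ctx.saved_tensors
        # Analytic derivatives of out = x * p:
        grad_x = grad_output * p          # d(out)/d(x) = p
        grad_p = (grad_output * x).sum()  # d(out)/d(p) = x, summed to a scalar
        return grad_x, grad_p
```

In your forward pass you would then call `SomeOtherFunc.apply(x, self.learnable_param)` instead of `self.userfunc(...)`, and gradients will flow back to the parameter. You can sanity-check the backward implementation with `torch.autograd.gradcheck` on double-precision inputs.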