# Loss function applied to first order derivative of conv network

Hi,
I have been hitting my head against the wall for a while here… maybe some kind person can help me out?

In a simple example I am trying to compute the second order derivative and put a cost function on it. For example, the cost function is (d(dy/dx)/dx - 0)^2.
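To make clear what I mean by putting a cost on a derivative, here is a minimal toy sketch (nothing to do with the model below, just y = x**3) using double calls to `torch.autograd.grad`:

```python
import torch

# toy function y = x**3, so dy/dx = 3x**2 and d2y/dx2 = 6x
x = torch.tensor([1.0], requires_grad=True)
y = (x ** 3).sum()

# first derivative; create_graph=True keeps it differentiable
dy_dx, = torch.autograd.grad(y, x, create_graph=True)

# second derivative
d2y_dx2, = torch.autograd.grad(dy_dx.sum(), x)

# the cost (d(dy/dx)/dx - 0)^2
cost = (d2y_dx2 - 0) ** 2
print(dy_dx.item(), d2y_dx2.item())  # 3.0 6.0
```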

In the example there is a 1x1 2d conv (filled with 1s and 0 bias) and a 1x1 linear layer (filled with 1s and 0 bias).

I expect to see [16] in both the linear layer's and conv layer's `.grad`, but I only see this in the linear layer's `.grad`, and the conv layer's `.grad` is None, even though as far as I can tell both have been used in almost exactly the same way.

Does anyone know what I am doing wrong here or how to fix it? I would like the cost function applied to the first derivative to backprop and update the `.grad` of the convolutional layer's weights, so that I could run gradient descent on the weights to minimize the cost function above.

As a slightly separate question, does anyone know where I can find more documentation on `autograd.grad` and `autograd.backward`? The current documentation online seems inaccurate (`only_inputs` is deprecated and obsolete); is there a better explanation of the graph vs. `autograd.grad` vs. `autograd.backward` somewhere that I am missing…?
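For anyone else confused by the same thing, this is my current understanding of the difference, in a minimal sketch: `autograd.grad` returns the gradients directly without touching `.grad`, while `autograd.backward` (like `loss.backward()`) accumulates into each leaf's `.grad`:

```python
import torch

w = torch.tensor([3.0], requires_grad=True)
loss = (w ** 2).sum()

# autograd.grad: returns the gradient as a tensor, does NOT touch w.grad
g, = torch.autograd.grad(loss, w)
print(g)       # tensor([6.])
print(w.grad)  # None -- nothing was accumulated

# autograd.backward: accumulates into w.grad instead of returning it
loss2 = (w ** 2).sum()
torch.autograd.backward(loss2)
print(w.grad)  # tensor([6.])
```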

Thank you,

Misko

```python
import torch

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.conv = torch.nn.Conv2d(1, 1, 1)
        self.conv.weight.data.fill_(1)
        self.conv.bias.data.fill_(0)
        self.linear = torch.nn.Linear(1, 1)
        self.linear.weight.data.fill_(1)
        self.linear.bias.data.fill_(0)

    def forward(self, xo):
        # reshape/sum required, otherwise the second derivative is a scalar
        # not depending on the other variable
        x = self.conv(xo.reshape(1, 1, 1, 1) ** 2).sum().reshape(1, 1)
        x += self.linear(xo.reshape(1, 1) ** 2)
        return x

model = Model()
xo = torch.ones(1, requires_grad=True)
y = model(xo)
# first derivative dy/dx, kept in the graph so the cost on it can backprop
dydx, = torch.autograd.grad(y, xo, create_graph=True)
loss = (dydx ** 2).sum()
loss.backward()
for name, parameter in model.named_parameters():
    print(name, parameter.grad)
```

I am still having trouble with this, but I think I have a simpler example… If you use just one Linear(1,1) module, everything works as expected; but if you use one Conv2d on a 1x1 input (equivalent to the linear module), then you get no gradient on the weights of the convolution…

```python
import torch
from torch import nn
from torchviz import make_dot, make_dot_from_trace

model = nn.Sequential()

# if it's only one conv module in the model
#model.add_module('C0', nn.Conv2d(1, 1, 1))  # either uncomment this line

# if it's just one linear module in the model
model.add_module('W0', nn.Linear(1, 1))      # or uncomment this line

def double_backprop(inputs, net):
    y = net(inputs).mean() ** 2