Get gradient with respect to a given layer

Hello! I am sorry if this is a silly question, but I am quite new to this, so I would really appreciate any help. How can I take the derivative of the output layer with respect to an intermediate layer? Here is a small, simplified example of what I mean:

import torch

class NN(torch.nn.Module):

    def __init__(self):
        super(NN, self).__init__()
        self.l1 = torch.nn.Linear(2, 3)
        self.l2 = torch.nn.Linear(3, 1)
        self.l3 = torch.nn.Linear(1, 3)
        self.l4 = torch.nn.Linear(3, 2)

    def forward(self, x):
        z = torch.sigmoid(self.l1(x))
        z = torch.sigmoid(self.l2(z))
        y_pred = torch.sigmoid(self.l3(z))
        y_pred = self.l4(y_pred)
        return y_pred, z

I also have a function func(x) = x + 1/x.

I want to define (minimize) my loss function like this: L = MSE(y,y_pred) + func(d(y_pred)/d(z))

Basically I want to take the derivative of the output with respect to z, which is an inner layer. Can someone help me? Thank you!


The thing is that d(y_pred)/d(z) is a 2D Jacobian matrix, and a single backward pass of the autograd engine only computes a vector-Jacobian product, not the full matrix.
Could you give more information about what func() is? Maybe it is doable in your case.
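To illustrate the point about vector-Jacobian products, here is a minimal sketch for a single sample. The `layer` below is a hypothetical stand-in for the part of the network between z and y_pred (it is not the model from the question). Each call to torch.autograd.grad on one output element yields one row of the Jacobian, so the full matrix needs one call per output:

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for the layers after z: y_pred = W z
layer = torch.nn.Linear(1, 2, bias=False)

z = torch.randn(1, requires_grad=True)  # intermediate activation, one sample
y_pred = layer(z)                       # shape (2,)

# Differentiating a single output element gives one row of the Jacobian.
rows = []
for k in range(y_pred.shape[0]):
    (row,) = torch.autograd.grad(y_pred[k], z, retain_graph=True)
    rows.append(row)

jacobian = torch.stack(rows)            # shape (2, 1)
```

For this purely linear stand-in the Jacobian is just the weight matrix, which makes the result easy to check by hand.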

Thank you for your reply. Actually, you can ignore func(); I realized I was overcomplicating things. Here is what I need. My output has 2 numbers, y_pred[0] and y_pred[1], and I want my loss to be: L = MSE(y,y_pred) + abs(sqrt((d(y_pred[0])/d(z))**2 + (d(y_pred[1])/d(z))**2)-1). In other words, I want the norm of my Jacobian to be 1. I tried torch.autograd.grad(output[0][0], output[1], create_graph=True)[0] to get, for example, d(y_pred[0])/d(z). It works for a single example, but it no longer works with a batch (I am not sure how to handle the dimensions of the tensors so that the gradients come out right).
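One possible way to handle the batch dimension, sketched under the assumption that each sample's output depends only on its own z (true for Linear and sigmoid layers, not for something like batch norm): sum the chosen output component over the batch before differentiating. Row j of the result is then d(y_pred[j][0])/d(z[j]). The `tail` module here is a hypothetical stand-in for the layers after z:

```python
import torch

torch.manual_seed(0)

# Hypothetical layers after z: maps z of shape (batch, 1) to y_pred of shape (batch, 2)
tail = torch.nn.Sequential(
    torch.nn.Linear(1, 3), torch.nn.Sigmoid(), torch.nn.Linear(3, 2)
)

z = torch.randn(5, 1, requires_grad=True)
y_pred = tail(z)

# Summing over the batch is safe because sample j's output depends only on z[j],
# so the cross terms d(y_pred[i]) / d(z[j]) for i != j are all zero.
(dy0_dz,) = torch.autograd.grad(y_pred[:, 0].sum(), z, create_graph=True)

# Reference: the same quantity computed one sample at a time.
reference = torch.stack([
    torch.autograd.grad(y_pred[j, 0], z, retain_graph=True)[0][j]
    for j in range(z.shape[0])
])
```

Both tensors have shape (5, 1) and should agree up to floating-point error.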

Update: I wrote this code, which works:

import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()  # must run before submodules are assigned
        self.linear1 = nn.Linear(2, 1, bias=False)
        self.linear2 = nn.Linear(1, 2, bias=False)

    def forward(self, x):
        z = self.linear1(x)
        y_pred = self.linear2(z)
        return y_pred, z

model = SimpleNet().cuda()

for epoch in range(1000):
    for i, dt in enumerate(data.trn_dl):
        output = model(dt[0])  # output[0] is y_pred, output[1] is z
        loss2 = 0
        # Accumulate the Jacobian-norm penalty one sample at a time
        for j in range(len(output[0])):
            l1 = torch.autograd.grad(output[0][j][0], output[1], create_graph=True)[0][j]
            l2 = torch.autograd.grad(output[0][j][1], output[1], create_graph=True)[0][j]
            loss2 = loss2 + abs(torch.sqrt(l1**2+l2**2)-1)
        loss1 = F.mse_loss(output[0], dt[1])
        loss = loss1+loss2
    if epoch%100==0:
        print(epoch, loss.item())  # assumed logging; the original body is cut off here

So it seems to work: the loss goes down, but it is very slow. I assume the reason is the inner for loop over the batch. Is there a way to get around that? Thank you!
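A possible loop-free version of the penalty, offered as a sketch rather than a definitive answer. It assumes each sample's output depends only on that sample's z, so that differentiating a batch sum gives per-sample gradients. The helper name `jacobian_norm_penalty` is made up for illustration. The remaining loop runs over the 2 output components, not over the samples:

```python
import torch

def jacobian_norm_penalty(y_pred, z):
    """Sum over the batch of | ||d y_pred / d z|| - 1 |, with no per-sample loop."""
    sq_norm = 0.0
    for k in range(y_pred.shape[1]):  # one autograd call per output component
        (g,) = torch.autograd.grad(y_pred[:, k].sum(), z, create_graph=True)
        sq_norm = sq_norm + (g ** 2).sum(dim=1)  # per-sample squared gradient
    return (torch.sqrt(sq_norm) - 1).abs().sum()

# Usage with the same layer shapes as SimpleNet (CPU, random data)
torch.manual_seed(0)
linear1 = torch.nn.Linear(2, 1, bias=False)
linear2 = torch.nn.Linear(1, 2, bias=False)

x = torch.randn(8, 2)
z = linear1(x)
y_pred = linear2(z)

loss2 = jacobian_norm_penalty(y_pred, z)
loss2.backward()  # create_graph=True keeps the penalty differentiable
```

In the training loop this would replace the inner `for j` loop with `loss2 = jacobian_norm_penalty(output[0], output[1])`. Like the original, it sums the penalty over the batch. If z had more than one feature, the `sum(dim=1)` would make this the Frobenius norm of the per-sample Jacobian, which is an assumption beyond what the original code covers.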