Hello! I have this code (a simplified version):
```python
import torch
import torch.nn as nn

grads = {}

def save_grad(name):
    def hook(grad):
        grads[name] = grad
    return hook

class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(2, 1, bias=False)
        self.linear2 = nn.Linear(1, 2, bias=False)

    def forward(self, x):
        z = self.linear1(x)
        y_pred = self.linear2(z)
        return y_pred, z
```
```python
for epoch in range(1000):
    model.train()
    for i, dt in enumerate(data.trn_dl):
        optimizer.zero_grad()
        output = model(dt[0])

        # gradient of x_hat w.r.t. z
        output[1].register_hook(save_grad('z_x_hat'))
        output[0][0].backward(retain_graph=True)
        z_x_hat = grads['z_x_hat']

        # gradient of y_hat w.r.t. z
        output[1].register_hook(save_grad('z_y_hat'))
        output[0][1].backward(retain_graph=True)
        z_y_hat = grads['z_y_hat']

        z_x_hat.requires_grad = True
        z_y_hat.requires_grad = True

        loss = abs(torch.sqrt(z_x_hat**2 + z_y_hat**2) - 1)
        print(z_x_hat, z_y_hat, loss)
        loss.backward()
        optimizer.step()
```
So my network has two outputs (x_hat, y_hat), and I want the loss function to depend on the derivatives of the outputs with respect to the value of z (I need this as part of something more complex in the actual project, but this is the part where I am stuck). In this simple case, the derivatives are just the values of the weights from z to x_hat and from z to y_hat (in the real case it is the product of all the partial derivatives from x_hat back to z). Using the code above:
```python
output[1].register_hook(save_grad('z_x_hat'))
output[0][0].backward(retain_graph=True)
z_x_hat = grads['z_x_hat']
```
I am able to get the value of this derivative (and hence the weight), but the problem is that I need `retain_graph=True` to do it, and that screws up my whole backprop: I want to optimize the network via backprop only on the loss at the end, not through these intermediate `backward()` calls. Can someone help me with this? Thank you!
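For context, here is the identity I am relying on, checked in isolation. This is a minimal self-contained sketch (it uses `torch.autograd.grad`, which my training code above does not; the layer here just mirrors `linear2`), showing that the derivative of each output w.r.t. z is the corresponding weight:

```python
# Standalone sanity check (not part of the training code above): for a
# single linear layer y_pred = linear2(z), the derivative of each output
# w.r.t. z is the corresponding row of linear2.weight.
import torch
import torch.nn as nn

torch.manual_seed(0)
linear2 = nn.Linear(1, 2, bias=False)  # same shape as linear2 in SimpleNet
z = torch.randn(1, 1, requires_grad=True)
y_pred = linear2(z)

# create_graph=True keeps the derivative itself differentiable, so it
# could in principle be fed into a loss afterwards.
(dz_x_hat,) = torch.autograd.grad(y_pred[0, 0], z, create_graph=True)
assert torch.allclose(dz_x_hat, linear2.weight[0])
```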