# How to manually implement the chain rule for gradients

```python
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4 * 4 * 50, 500)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4 * 4 * 50)  # flatten: two conv+pool stages leave 4x4x50 features
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)
```

This is the structure of the neural network.
I already have the derivative of `output` w.r.t. `conv1.weight`, `conv2.weight`, `fc1.weight`, and `fc2.weight`, obtained by calling
`output.backward(torch.ones_like(output))`
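Concretely, what I have so far looks like this (the batch size and MNIST-shaped input are just examples, not fixed by the model):

```python
import torch

model = Net()                   # Net as defined above
x = torch.randn(8, 1, 28, 28)   # example MNIST-shaped batch
output = model(x)               # shape (8, 10)

# Backpropagating a tensor of ones computes, for every weight, the sum over
# all output elements i of d(output_i)/d(weight), i.e. a vector-Jacobian
# product of the output's Jacobian with a vector of ones.
output.backward(torch.ones_like(output))
print(model.conv1.weight.grad.shape)  # torch.Size([20, 1, 5, 5])
```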
Then I introduce a loss function `f`.
How can I use the derivative of `f` w.r.t. `output`, together with the derivative of `output` w.r.t. `conv1.weight...`, to compute the derivative of `f` w.r.t. `conv1.weight...`? (Note that `output` has more than one element.)
I mean, I don't want to compute it directly by `loss.backward()`; I'd like to compute it step by step.
Thank you.

`output.backward(df_doutput)`, where `df_doutput` is the derivative of `f` w.r.t. `output`.
I still don't see why you wouldn't just call `f.backward()`, though.
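A minimal sketch of that two-step route, checked against the direct `loss.backward()`; the random input, the dummy labels, and the choice of `F.nll_loss` as the loss `f` are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

model = Net()                         # Net as defined above
x = torch.randn(8, 1, 28, 28)         # assumed input batch
target = torch.randint(0, 10, (8,))   # assumed dummy labels

output = model(x)
f = F.nll_loss(output, target)        # assumed choice of loss f

# Step 1: derivative of f w.r.t. output alone (the graph is kept alive
# so it can be walked a second time below).
df_doutput = torch.autograd.grad(f, output, retain_graph=True)[0]

# Step 2: vector-Jacobian product df/dw = df/doutput . doutput/dw,
# accumulated into each parameter's .grad -- the chain rule, done manually.
output.backward(df_doutput)
manual = model.conv1.weight.grad.clone()

# Direct route for comparison.
model.zero_grad()
F.nll_loss(model(x), target).backward()
print(torch.allclose(manual, model.conv1.weight.grad))  # True
```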

Because I need to do something with the gradients of the weights, I have to compute them manually.
I have the gradients from `output.backward()` (computed manually). How do I combine them with `df_doutput`?
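One way to make the combination explicit is `torch.autograd.grad(output, params, grad_outputs=df_doutput)`, which returns the chain-rule product `df/doutput · doutput/dw` as plain tensors that can be modified before being applied. Note that the gradients produced by `output.backward(torch.ones_like(output))` cannot be reused for this: that call already summed `doutput/dw` over the output elements, before `df_doutput` was known. A sketch, again assuming `F.nll_loss` as `f` and an arbitrary learning rate and clipping rule:

```python
import torch
import torch.nn.functional as F

model = Net()                         # Net as defined above
x = torch.randn(8, 1, 28, 28)         # assumed input batch
target = torch.randint(0, 10, (8,))   # assumed dummy labels

output = model(x)
f = F.nll_loss(output, target)        # assumed choice of loss f

# df/doutput, keeping the graph alive for the second pass.
df_doutput = torch.autograd.grad(f, output, retain_graph=True)[0]

# df/dw = df/doutput . doutput/dw for every weight, returned as tensors
# instead of being accumulated into .grad, so they can be edited freely.
params = list(model.parameters())
grads = torch.autograd.grad(output, params, grad_outputs=df_doutput)

# "Do something" with the gradients, e.g. clip them, then apply by hand.
with torch.no_grad():
    for p, g in zip(params, grads):
        p -= 0.01 * g.clamp(-1.0, 1.0)  # assumed lr and clipping rule
```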