Calculating derivative of loss w.r.t. single parameter

The following code correctly calculates the derivative of the loss w.r.t. each element of x:

import torch

def loss(x): return torch.sum(torch.pow(x, 3))
x = torch.tensor([1, 3, 5], dtype=torch.float64, requires_grad=True)
loss_out = loss(x)
first_derivative = torch.autograd.grad(loss_out, x, create_graph=True)[0]

first_derivative = [3., 27., 75.] = 3*x^2

However, if I only want to calculate the derivative with respect to a single element, running the same code with a modified final line throws an error:
first_derivative = torch.autograd.grad(loss_out, x[0], create_graph=True)[0]
RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.

I must be missing something quite basic about PyTorch.
Any indexing I perform on x (such as x[0:3]) returns this error, claiming that the variable is not used in the graph.
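
Following the error's suggestion and passing allow_unused=True doesn't raise, but the gradient returned for x[0] is simply None, which seems to confirm that the indexed tensor isn't part of the graph at all:

first_derivative = torch.autograd.grad(loss_out, x[0], create_graph=True, allow_unused=True)[0]
print(first_derivative)  # None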

Can anyone help me see what I’ve done wrong?

Thanks

Any indexing you do on x creates a new view tensor of the base tensor x. In terms of gradients, the flow is one-directional: if the view is used in the loss, the gradients will flow back to the base, but if only the base is used (the base isn't aware of any of its views), no gradient ever reaches the view. What you could do instead is index after computing the gradient (first_derivative[0] with torch.autograd.grad, or x.grad[0] if you call loss_out.backward()), or compute the loss using the view instead of the base.
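
Something along these lines, for example (the names x0, full_grad and d_x0 are just illustrative):

import torch

def loss(x): return torch.sum(torch.pow(x, 3))

x = torch.tensor([1, 3, 5], dtype=torch.float64, requires_grad=True)

# Option 1: differentiate w.r.t. the whole base tensor, then index the result.
loss_out = loss(x)
full_grad = torch.autograd.grad(loss_out, x, create_graph=True)[0]
d_x0 = full_grad[0]  # 3 * x[0]**2 = 3.0

# Option 2: create the view *before* the forward pass and build the loss from it,
# so the view actually appears in the graph.
x0 = x[0]  # view of the base tensor x
loss_out = torch.pow(x0, 3) + torch.sum(torch.pow(x[1:], 3))
d_x0 = torch.autograd.grad(loss_out, x0, create_graph=True)[0]  # also 3.0

Option 1 is usually the simplest if you need the full gradient anyway; Option 2 only pays off if you genuinely want the graph to treat that element as its own input.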
