Double propagation issue


I have a network (dnn_Linear) with output v3. I would first like to estimate a variable ss, the Jacobian of this network w.r.t. its parameters. Then I would like to backpropagate the loss, and later mutate the gradients by multiplying them with ss.

Now I get the error:
"Trying to backward through the graph a second time"

The relevant part of the script is:
ss = torch.autograd.grad(v3, dnn_Linear.parameters(), grad_outputs=torch.ones_like(v3), retain_graph=None, create_graph=False, only_inputs=True, allow_unused=True) # jacobian
Total_loss.backward(retain_graph=True) # back propagate
v3.grad = torch.tensor(ss).T * v3.grad # mutate part

How can I achieve this?


Use retain_graph=True in the torch.autograd.grad call and remove it from the backward call assuming you don’t want to backpropagate another time.
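A minimal sketch of that fix, with made-up shapes and loss (dnn_Linear, v3, and Total_loss here are stand-ins for the poster's actual variables):

```python
import torch
import torch.nn as nn

# Stand-ins for the poster's network, output, and loss
dnn_Linear = nn.Linear(3, 3)
x = torch.randn(4, 3)
v3 = dnn_Linear(x)
Total_loss = v3.pow(2).mean()

# retain_graph=True here keeps the graph alive for the backward call below
ss = torch.autograd.grad(
    v3, dnn_Linear.parameters(),
    grad_outputs=torch.ones_like(v3),
    retain_graph=True,
    allow_unused=True)  # tuple: one gradient per parameter

# the graph is still intact, so this no longer raises the "second time" error
Total_loss.backward()
```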

Thanks, but I do need to backpropagate another time. Yes, this is solved. Thanks.

However, for the last part:
v3.grad = ss.T * v3.grad # mutate part
I get the following error:
"'tuple' object has no attribute 'T'"

What are the meaningful values of ss in this tuple?


torch.autograd.grad returns the sum of gradients of outputs with respect to the inputs as a tuple.
If a single input is passed, the tuple will contain a single element:

import torch
import torch.nn as nn

model = nn.Linear(10, 10)
x = torch.randn(1, 10)
out = model(x)

grads = torch.autograd.grad(out.mean(), model.bias)
# (tensor([0.1000, 0.1000, 0.1000, 0.1000, 0.1000, 0.1000, 0.1000, 0.1000, 0.1000,
#          0.1000]),)

Thanks. Not sure if it is relevant to this thread. The ss tuple contains tensors of shape (16, 3, 3, 3) and (16,).

I want to multiply ss with the dnn_Linear.parameters(), before optimizer.step()


The grads tuple will correspond to the same order dnn_Linear.parameters() is returning the parameters, so you could iterate both and apply your update rule.

Thanks. It would help if there were an example of multiplying ss with dnn_Linear.parameters(). ss is a tuple of size 2. Does that mean dnn_Linear.parameters() is also a tuple of the same size? If yes, how can we multiply both before optimizer.step()?

Something like this would work:

import torch
import torch.nn as nn

model = nn.Linear(10, 10)
x = torch.randn(1, 10)
out = model(x)

grads = torch.autograd.grad(out.mean(), model.parameters())

for grad, param in zip(grads, model.parameters()):
    with torch.no_grad():
        param.sub_(grad)
but you won’t be able to use an optimizer anymore, since you are now manually manipulating the parameters, making the forward activations stale.
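For illustration, here is one way such a manual update could fold in a learning rate; the SGD-style rule param.sub_(lr * grad) and the lr value are assumptions, not a prescribed recipe:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 10)
x = torch.randn(1, 10)
out = model(x)

grads = torch.autograd.grad(out.mean(), model.parameters())

lr = 0.1  # assumed learning rate, chosen by hand since no optimizer is used
with torch.no_grad():
    for grad, param in zip(grads, model.parameters()):
        # in-place SGD-style step: param <- param - lr * grad
        param.sub_(lr * grad)
```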

Thanks. I wonder what sub_ does here. What does param.sub_(grad) do?

Can I not do optimizer.step() after this? If not, how does the learning rate come into play?

Can you please explain this step more. Thanks.

@ptrblck, thanks for your help. The only thing I would like to add here is that I wanted to multiply and yet double-propagate.
So my correct script after your help is:

Total_loss.backward(retain_graph=True) # back propagate
for grad, param in zip(grads, model.parameters()):
    pp = param * grad # I am retaining the graph, so an in-place operation would not work
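Putting the thread together, an end-to-end sketch might look like this; the layer sizes, the input, and the loss are invented for illustration, and the products are collected into a list rather than written back in place, since the graph is retained:

```python
import torch
import torch.nn as nn

# Stand-ins for the poster's network, output, and loss
dnn_Linear = nn.Linear(3, 3)
x = torch.randn(4, 3)
v3 = dnn_Linear(x)
Total_loss = v3.pow(2).mean()

# per-parameter gradients of v3; retain_graph=True keeps the graph alive
ss = torch.autograd.grad(
    v3, dnn_Linear.parameters(),
    grad_outputs=torch.ones_like(v3),
    retain_graph=True)

Total_loss.backward(retain_graph=True)  # back propagate

# mutate out of place: each product is a fresh tensor, so the
# retained graph and the original parameters are untouched
mutated = [param * grad for grad, param in zip(ss, dnn_Linear.parameters())]
```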