Calculate the gradient of one specific layer after the network's backward pass

After calling loss.backward() for a network, I want to calculate the gradient of a specific layer with respect to another loss. The simplest way is to call x2 = x.detach(), then run the forward and backward passes again.

For example:

import torch
import torch.nn as nn

a = torch.rand((1, 1, 4, 4))
b = nn.Conv2d(1, 1, 3, padding=1, bias=False)
d = nn.Conv2d(1, 1, 3, padding=1, bias=False)
loss1 = torch.rand((1, 1, 4, 4))   # upstream gradient for the first loss
loss2 = torch.rand((1, 1, 4, 4))   # upstream gradient for the second loss

c = b(a)
c2 = c.detach()          # cut the graph so b receives no gradient
e = d(c)
e2 = d(c2)
e2.backward(loss1)       # only d receives a gradient here
print('detach c:', b.weight.grad, d.weight.grad)
d.weight.grad.zero_()

e.backward(loss1)        # full backward through d and b
print('first bp:', b.weight.grad, d.weight.grad)
b.weight.grad.zero_()
d.weight.grad.zero_()

c = b(a)                 # second forward pass
e = d(c)
e.backward(loss1)
print('second bp:', b.weight.grad, d.weight.grad)

However, the results of e = d(c) and e2 = d(c2) are exactly the same. Is there any way to avoid the second forward pass through layer d?

If you use the retain_graph=True parameter, the computation graph (and its intermediate buffers) will be kept after you run a backward pass.

e.backward(loss2, retain_graph=True)

This means you can now run another backward pass over the same graph without getting an error.
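For example, a quick sketch reusing the tensors from the code above (this assumes a fresh forward pass, so the graph has not been freed yet):

c = b(a)
e = d(c)
e.backward(loss2, retain_graph=True)  # keeps the graph alive
e.backward(loss1)                     # second backward over the same graph, no error
# note: the gradients from both passes are summed into .grad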

Thanks for the reply.

If I only use retain_graph=True, then the gradients of every parameter in the graph will be computed. However, I only want to calculate the gradient of one specific layer (e.g., the conv layer d in the example code).

You could use the torch.autograd.grad() API.
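For instance, a minimal sketch reusing the layers from the example above (it assumes a fresh forward pass so the graph is available). torch.autograd.grad() returns the requested gradient directly instead of accumulating it into .grad, and it leaves b.weight.grad untouched:

c = b(a)
e = d(c)
# compute only the gradient of e w.r.t. d.weight, with loss2 as the
# upstream gradient; nothing is written into .grad, and b is not touched
grad_d, = torch.autograd.grad(outputs=e, inputs=d.weight, grad_outputs=loss2)
print('grad of d only:', grad_d)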

That is exactly what I want, thank you very much!

Maybe it's not so related…
If I have loss = loss1 + loss2, and loss1 needs a lot of sampling to compute, is it correct to do something like this:

for _ in range(sample_times):
    loss1 = get_loss1() / sample_times
    loss1.backward(retain_graph=True)

loss2.backward()
...
optimizer.step()

thanks!!

In my view, for a given forward pass you can do multiple backward passes by using retain_graph=True, and the gradients from each pass are accumulated in .grad. If this is your scenario, it should work, I guess.
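A quick way to check the accumulation behaviour this relies on, using a small stand-in model (hypothetical, not from the thread): several backward calls with retain_graph=True sum their gradients into .grad, which is exactly what splitting loss = loss1 + loss2 across backward calls depends on.

import torch
import torch.nn as nn

layer = nn.Linear(3, 1)          # stand-in model, just for the check
x = torch.rand(5, 3)

# split backward: loss1 first (keeping the graph), then loss2
layer.zero_grad()
out = layer(x)
loss1 = out.mean()
loss2 = out.pow(2).mean()
loss1.backward(retain_graph=True)
loss2.backward()
split_grad = layer.weight.grad.clone()

# single backward on the combined loss
layer.zero_grad()
out = layer(x)
(out.mean() + out.pow(2).mean()).backward()

print(torch.allclose(split_grad, layer.weight.grad))  # expected: True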