Hi,
I want to modify the gradients of an nn.Conv2d in its backward pass; here is my code:
import torch
import torch.nn as nn

def fun(module, grad_in, grad_out):
    # grad_in and grad_out are tuples of tensors (entries may be None),
    # so print the shape of each element instead of the tuple itself
    print('grad_in')
    print([g.shape if g is not None else None for g in grad_in])  # add break point here
    print('grad_out')
    print([g.shape if g is not None else None for g in grad_out])
class testNet(nn.Module):
    def __init__(self):
        super(testNet, self).__init__()
        self.l1 = nn.Conv2d(2, 5, 3)
        self.l2 = nn.Conv2d(5, 1, 3)
        self.l3 = nn.Linear(28 * 28, 1)
        self.l1.register_backward_hook(fun)
        # self.l2.register_backward_hook(fun)
        initialize_weights(self)  # my own init helper

    def forward(self, input):
        x = self.l1(input)
        x = self.l2(x)
        print(x.shape)  # (2, 1, 28, 28)
        return self.l3(x.view(2, -1))
if __name__ == '__main__':
    input = torch.randn(2, 2, 32, 32)
    net = testNet()
    o = net(input)
    print(o.shape)  # (2, 1)
    o.backward(torch.ones_like(o))
I used PyCharm's debugger to inspect the shapes of grad_in and grad_out for self.l1, and I found that grad_in is a tuple of size 3 and grad_out is a tuple of size 1.
The grad_in is:
(None, (5, 2, 3, 3), (5,))
and the grad_out is:
((2, 5, 30, 30),)
And if I uncomment self.l2.register_backward_hook(fun), the debugger hits the hook for self.l2 first.
The grad_in is:
((2, 5, 30, 30), (1, 5, 3, 3), (1,))
and the grad_out is:
((2, 1, 28, 28),)
The questions are:
1. What is grad_in[2] of nn.Conv2d used for?
2. I tried to return two objects, grad_in2 and grad_out2, from fun, but an error was raised saying that fun should return 3 objects, not 2. Why 3 objects?
3. What should I do if I need to store the history of gradients, to be used later as information for computing the new gradient?
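For question 3, something like this is what I have in mind (a minimal sketch; `record_fun` and the `grad_history` dict are just made-up names): keep a per-module list of the gradients seen so far, appended from inside the hook.

```python
import torch
import torch.nn as nn

grad_history = {}  # module -> list of grad_in tuples seen so far

def record_fun(module, grad_in, grad_out):
    # detach/clone so the stored tensors don't keep the graph alive
    grad_history.setdefault(module, []).append(
        tuple(g.detach().clone() if g is not None else None for g in grad_in)
    )

conv = nn.Conv2d(2, 5, 3)
conv.register_backward_hook(record_fun)

for _ in range(3):
    out = conv(torch.randn(2, 2, 32, 32))
    out.backward(torch.ones_like(out))

print(len(grad_history[conv]))  # one entry per backward pass: 3
```

Is this the right way to keep the history, or is there a recommended pattern for this?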
4. What I want to achieve is assigning new gradients to the kernels of the Conv2d layers according to their gradient history and the current gradient. I guess grad_in[2] may be a factor that affects the update speed of different kernels. Is changing only this gradient enough to achieve my goal?
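For question 4, my current attempt looks like this (a sketch only; scaling by 0.5 is a stand-in for the history-based rule I actually want): return a new tuple of the same length as grad_in from the hook, which I suspect is why the error above complained about 3 objects.

```python
import torch
import torch.nn as nn

def scale_fun(module, grad_in, grad_out):
    # return a tuple of the same length as grad_in;
    # 0.5 is a placeholder for the factor I actually want to compute
    return tuple(g * 0.5 if g is not None else None for g in grad_in)

conv = nn.Conv2d(2, 5, 3)
conv.register_backward_hook(scale_fun)

out = conv(torch.randn(2, 2, 32, 32))
out.backward(torch.ones_like(out))
print(conv.weight.grad.shape)  # torch.Size([5, 2, 3, 3])
```

Does returning the whole modified tuple like this correctly replace the kernel gradients, or do I only need to change one entry?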
Any help would be appreciated!