Hello, I would like to get the gradient of a certain module, so I ran an experiment and
found something that I cannot understand.
I made a model that has only one layer and one weight, fed a single number into it, and tried to get the gradient with respect to that number. My code is as follows:
import torch as tc
import torch.nn as nn


class fcl(nn.Module):
    def __init__(self):
        super(fcl, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(1, 1)
        )
        # Hook on the whole Sequential container, not on the Linear layer itself.
        self.model.register_backward_hook(self.hookfunc)

    def forward(self, x):
        out = self.model(x)
        return out

    def hookfunc(self, module, gradInput, gradoutput):
        # Print every gradient tensor the hook receives.
        for i in gradInput:
            print('gradInput :', i)
        for i in gradoutput:
            print('gradoutput :', i)


cpu_dtype = tc.FloatTensor
gpu_dtype = tc.cuda.FloatTensor

model_test = fcl()
loss_fn = nn.MSELoss()
optimizer = tc.optim.Adam(model_test.parameters(), lr=0.00001)

a = tc.Tensor([5])
b = tc.Tensor([2, 3])
print(a)

out = model_test(a)
print('out :', out)
print('\nweight :\n', model_test.model[0].weight)
print('\nbias :\n', model_test.model[0].bias)

optimizer.zero_grad()
out.backward()  # out has a single element, so no explicit gradient argument is needed
The result is:
tensor([5.])
out : tensor([-5.0639], grad_fn=<ThAddBackward>)
weight :
Parameter containing:
tensor([[-0.9929]], requires_grad=True)
bias :
Parameter containing:
tensor([-0.0993], requires_grad=True)
gradInput : tensor([1.])
gradInput : tensor([1.])
gradoutput : tensor([1.])
But why are gradInput and gradoutput both 1?
I expected gradInput to be the same as the weight, which is -0.9929.
I also can't understand why gradInput has 2 elements while gradoutput has only one.
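
For comparison, the expected value can be checked directly with autograd, without any hook. This is only a minimal sketch, assuming a recent PyTorch version; the names layer and x and the seed are illustrative and not part of the code above. For out = w * x + b, the gradient with respect to the input should be w, the gradient with respect to the weight should be x, and the gradient with respect to the bias should be 1.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only so repeated runs use the same weight

layer = nn.Linear(1, 1)

# The input must require grad; otherwise autograd never computes d(out)/d(x).
x = torch.tensor([5.0], requires_grad=True)

out = layer(x)
out.backward()  # single-element output, so the upstream gradient defaults to 1

print('weight      :', layer.weight)
print('d(out)/d(x) :', x.grad)             # should match the weight
print('d(out)/d(w) :', layer.weight.grad)  # should match the input, 5
print('d(out)/d(b) :', layer.bias.grad)    # should be 1
```

Here x.grad is the quantity I expected to see as gradInput in the hook; the sketch does not explain what the hook itself reports.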