The output of weight * input is wrong

Juna · October 6, 2018, 1:26pm

Hello, I would like to get the gradient of a certain module, so I ran an experiment and found something that I cannot understand.
I made a model that has only one layer and one weight, fed a single number into it, and then tried to get the gradient with respect to that number. My code is as follows:

import torch as tc
import torch.nn as nn

class fcl(nn.Module):
    def __init__(self):
        super(fcl, self).__init__()

        self.model = nn.Sequential(
            nn.Linear(1, 1)
        )

        self.model.register_backward_hook(self.hookfunc)

    def forward(self, x):
        out = self.model(x)
        return out
    
    def hookfunc(self, module, gradInput, gradoutput):
        for i in gradInput:
            print('gradInput :', i)
        for i in gradoutput:
            print('gradoutput :', i)
     



cpu_dtype = tc.FloatTensor
gpu_dtype = tc.cuda.FloatTensor

model_test = fcl()
loss_fn = nn.MSELoss()
optimizer = tc.optim.Adam(model_test.parameters(), lr=0.00001)



a = tc.Tensor([5])
b = tc.Tensor([2, 3])

print(a)
out = model_test(a)
print('out :', out)
print('\nweight :\n', model_test.model[0].weight)
print('\nbias :\n', model_test.model[0].bias)
optimizer.zero_grad()

out.backward()

And the result is this:

tensor([5.])
out : tensor([-5.0639], grad_fn=<ThAddBackward>)

weight :
 Parameter containing:
tensor([[-0.9929]], requires_grad=True)

bias :
 Parameter containing:
tensor([-0.0993], requires_grad=True)
gradInput : tensor([1.])
gradInput : tensor([1.])
gradoutput : tensor([1.])

But why are gradInput and gradoutput both 1?
I expected gradInput to be the same as the weight, which is -0.9929.
I also can’t understand why gradInput has 2 elements, unlike gradoutput.

albanD · October 8, 2018, 12:59pm

Hi,

The backward hooks for nn.Modules are not working properly at the moment (should be fixed soon).
The gradients that you see here are not the ones you expect.
To check gradients, I would add hooks inside the forward function like:

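def get_hook_fn(name):
    # Returns a hook that prints the gradient of the tensor it is registered on
    def hook(grad):
        print("grad for {}".format(name))
        print(grad)
    return hook

class fcl(nn.Module):
    # __init__ is unchanged from your code
    def forward(self, x):
        # x must require grad for register_hook to work
        x.register_hook(get_hook_fn("x"))
        out = self.model(x)
        out.register_hook(get_hook_fn("out"))
        return out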
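
For reference, here is a minimal self-contained sketch of the tensor-hook approach. It is an illustration under a few assumptions, not the only way to do it: the `captured` dict and the class name `FCL` are hypothetical names introduced here, and the input is created with `requires_grad=True` because a hook can only be registered on a tensor that requires grad.

```python
import torch
import torch.nn as nn

# Hypothetical container, used only to record gradients for inspection.
captured = {}

def get_hook_fn(name):
    # Returns a hook that stores the gradient of the tensor it is registered on.
    def hook(grad):
        captured[name] = grad.clone()
    return hook

class FCL(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(nn.Linear(1, 1))

    def forward(self, x):
        x.register_hook(get_hook_fn("x"))      # gradient w.r.t. the input
        out = self.model(x)
        out.register_hook(get_hook_fn("out"))  # gradient w.r.t. the output
        return out

net = FCL()
# The input must require grad, otherwise x.register_hook raises an error.
x = torch.tensor([5.0], requires_grad=True)
out = net(x)
out.backward()

weight = net.model[0].weight.detach()
print("grad for x:", captured["x"])
print("grad for out:", captured["out"])
print("weight:", weight)
```

With this setup the gradient recorded for `x` should equal the layer's weight, and the gradient recorded for `out` should be 1, since `out.backward()` on a single-element tensor starts from a gradient of 1.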

As for your forward value, don’t forget that a Linear layer has a bias: the output is weight * input + bias, not just weight * input.
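
As a quick check of the forward value, the sketch below sets a `nn.Linear(1, 1)` layer to the weight and bias printed in the question (-0.9929 and -0.0993, which are rounded values) and compares the layer output against weight * input + bias computed by hand. The variable names are chosen here for illustration.

```python
import torch
import torch.nn as nn

layer = nn.Linear(1, 1)

# Overwrite the random initialization with the (rounded) values from the post.
with torch.no_grad():
    layer.weight.fill_(-0.9929)
    layer.bias.fill_(-0.0993)

x = torch.tensor([5.0])

out = layer(x)                                    # what the layer computes
manual = layer.weight.view(-1) * x + layer.bias   # weight * input + bias

print("layer output :", out)
print("manual result:", manual)
```

Both values come out at roughly -5.0638, which matches the -5.0639 in the question up to the rounding of the printed weight and bias.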

Juna · October 9, 2018, 2:04am

Thank you so much!
It works!!
I hope the bug will be solved soon!