I am registering hooks to update my convolutional layers in this way:
for i, conv_layer in enumerate(model.convos):
    conv_layer.weight.retain_grad()
    print("conv_layer.weight.shape ", conv_layer.weight.shape)
    print("Ms[i].shape ", Ms[i].shape)
    h = conv_layer.weight.register_hook(lambda grad: convo_grad_mult(grad, Ms[i]))
The output of the print statements is:
conv_layer.weight.shape torch.Size([17, 3, 3, 3])
Ms[i].shape torch.Size([3, 3])
conv_layer.weight.shape torch.Size([35, 17, 3, 3])
Ms[i].shape torch.Size([17, 17])
conv_layer.weight.shape torch.Size([9, 35, 3, 3])
Ms[i].shape torch.Size([35, 35])
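For context, here is a minimal stand-in that produces exactly these shapes (SmallNet and the identity matrices are made up for illustration; my real model and Ms are different, but the layer and matrix sizes match the printout above):

import torch
import torch.nn as nn

# Hypothetical stand-in: three conv layers whose weight shapes match the
# printed ones ([17, 3, 3, 3], [35, 17, 3, 3], [9, 35, 3, 3]).
class SmallNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.convos = nn.ModuleList([
            nn.Conv2d(3, 17, kernel_size=3),
            nn.Conv2d(17, 35, kernel_size=3),
            nn.Conv2d(35, 9, kernel_size=3),
        ])

    def forward(self, x):
        for conv in self.convos:
            x = torch.relu(conv(x))
        return x

model = SmallNet()
# One square matrix per layer, sized by that layer's in_channels;
# identity matrices here are only placeholders for the real Ms.
Ms = [torch.eye(3), torch.eye(17), torch.eye(35)]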
Then, when the hooks fire during backward(), I print the sizes of the arguments inside convo_grad_mult:
> M torch.Size([35, 35])
> grad torch.Size([35, 81])
> M torch.Size([35, 35]) **# here I get the error, since I expect M of size [17, 17]**
> grad torch.Size([17, 315])
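For reference, a simplified stand-in for convo_grad_mult that produces those grad shapes (the key assumption is that the [out, in, k, k] weight gradient is flattened to [in_channels, out_channels*k*k] before being multiplied by the [in_channels, in_channels] matrix M, which is consistent with the [35, 81] and [17, 315] sizes above):

def convo_grad_mult(grad, M):
    # grad arrives with the weight's shape: [out_channels, in_channels, k, k]
    out_ch, in_ch, k1, k2 = grad.shape
    # Flatten to [in_channels, out_channels * k * k] so M ([in, in]) can left-multiply it.
    g = grad.permute(1, 0, 2, 3).reshape(in_ch, -1)
    print("M", M.shape)
    print("grad", g.shape)
    # A tensor returned from a hook replaces the gradient, so reshape back.
    return (M @ g).reshape(in_ch, out_ch, k1, k2).permute(1, 0, 2, 3)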
Why am I receiving the same M again, when I should clearly have received the M of size [17, 17]?
Here is the full training loop.
EDIT:
Here is a much simpler setting that reproduces the issue:
def dummy(arg):
    print(arg)
…
for i, conv_layer in enumerate(model.convos):
    print("i={}".format(i))
    print("conv_layer.weight.shape ", conv_layer.weight.shape)
    h = conv_layer.weight.register_hook(lambda grad: dummy(i))
    hooks.append(h)
Output:
i=0
conv_layer.weight.shape torch.Size([17, 3, 3, 3])
i=1
conv_layer.weight.shape torch.Size([35, 17, 3, 3])
i=2
conv_layer.weight.shape torch.Size([9, 35, 3, 3])
After calling backward() I get:
2
2
2
Why does it only register the last one? I am clearly registering hooks for different conv layers.
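For completeness, here is a runnable version of this reproduction, reusing the hypothetical SmallNet stand-in from above (the input size and loss are arbitrary, just enough to make backward() fire the hooks):

model = SmallNet()
hooks = []

def dummy(arg):
    print(arg)

for i, conv_layer in enumerate(model.convos):
    h = conv_layer.weight.register_hook(lambda grad: dummy(i))
    hooks.append(h)

# Arbitrary forward/backward pass just to trigger the hooks.
out = model(torch.randn(1, 3, 32, 32))
out.sum().backward()   # prints 2, 2, 2 rather than 2, 1, 0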
If I do
h1 = model.convos[0].weight.register_hook(lambda grad: dummy(0))
h2 = model.convos[1].weight.register_hook(lambda grad: dummy(1))
h3 = model.convos[2].weight.register_hook(lambda grad: dummy(2))
then I get the correct output:
2
1
0
Why does it not work with the loop?
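I suspect this might be related to how the lambda captures the loop variable i (late binding). If so, binding it as a default argument, as sketched below, should give each hook its own i, but I have not verified this in the full training loop, and I would still like to understand why the plain lambda sees the same i everywhere:

for i, conv_layer in enumerate(model.convos):
    # The default argument is evaluated when the lambda is created,
    # so (presumably) each hook keeps its own copy of i.
    h = conv_layer.weight.register_hook(lambda grad, i=i: dummy(i))
    hooks.append(h)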