I am wondering what I am doing wrong with register_hook, since it does not seem to register a hook.
My goal is to modify the gradient matrix before the weights are updated. For debugging purposes I just return a zero matrix as the gradient, but the network still trains perfectly.
So clearly the hooks don’t work, but I cannot figure out why. Here is my train function:
model = Net()
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=momentum, weight_decay=weight_decay)
for e in range(epochs):
    agg_loss = 0
    for data in trainloader:
        x, y = data
        x = x.cuda()
        y = y.cuda()
        outputs = model(x)
        loss = criterion(outputs, y)
        hooks = []
        for conv_layer in model.convos:
            h = conv_layer.weight.register_hook(lambda grad: torch.zeros(size=grad))
        for fc_layer in model.linears:
            h = fc_layer.weight.register_hook(lambda grad: torch.zeros(size=grad))
        for h in hooks:
Disregard this post, as it was a wrong suggestion.
Oh, I see… I thought I could just pass a function that returns the new value, something like
lambda grad: func(grad), to update the weights, because I need to perform operations on the gradient that are not built-in tensor methods. Is this possible with register_hook?
The function would be a series of different matrix manipulations.
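For what it's worth, a hook can be any callable that takes the gradient and returns a tensor of the same shape, so a named function with several matrix operations should work as well as a lambda. Here is a minimal sketch; `modify_grad` and the specific manipulations (keep the upper triangle, then halve it) are made up for illustration and are not from the original code:

```python
import torch
import torch.nn as nn


def modify_grad(grad):
    # Hypothetical series of matrix manipulations on the raw gradient.
    upper = torch.triu(grad)   # zero out everything below the diagonal
    return upper * 0.5         # then scale what is left


layer = nn.Linear(3, 3, bias=False)
layer.weight.register_hook(modify_grad)

x = torch.ones(1, 3)
layer(x).sum().backward()      # raw weight gradient would be all ones

modified_grad = layer.weight.grad.clone()
print(modified_grad)
```

The value returned by the function is what ends up in `layer.weight.grad`, and therefore what the optimizer sees.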
No, sorry, I was wrong. Your code snippet should be the right way to do it, and it also seems to work:
model = nn.Linear(1, 1)
model.weight.register_hook(lambda grad: torch.ones_like(grad) * 1000)
I’ll edit my previous post.
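A slightly fuller version of that check, as a sketch: the input batch `x` and the `.mean()` reduction are arbitrary choices for the test and not part of the snippet above.

```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)
# The hook's return value replaces the gradient of model.weight.
handle = model.weight.register_hook(lambda grad: torch.ones_like(grad) * 1000)

x = torch.randn(4, 1)        # hypothetical input batch
model(x).mean().backward()   # the hook runs during this call

weight_grad = model.weight.grad.clone()
print(weight_grad)           # every entry is 1000, whatever x was

handle.remove()              # detach the hook once it is no longer needed
```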
Wait, then what is my problem? It definitely did not work for me with my training code: it just trained as usual, while the training was supposed to fail…
optimizer.backward() then right way of doing it?
I think your code registers the hooks too late (after the backward call).
Note that the hook will be called during the gradient calculation.
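To illustrate the timing, here is a small sketch (the tensor `w` and the `record` function are made up for the demo): a hook registered after a backward pass has finished does not run for that pass, only for later ones.

```python
import torch

w = torch.ones(2, requires_grad=True)
fired = []


def record(grad):
    # Only records the call; returning None leaves the gradient unchanged.
    fired.append(grad.clone())


(w * 3).sum().backward()   # first backward pass, no hook registered yet
w.register_hook(record)    # too late for the pass above
fired_late = len(fired)

(w * 3).sum().backward()   # second pass, the hook is in place
fired_early = len(fired)

print(fired_late, fired_early)
```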
You could register the hooks during setup and use the standard training loop without removing and adding the hooks again.
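Something along these lines might work as a starting point. It is only a sketch under a few assumptions: the small `nn.Sequential` model and the random data stand in for `Net()` and `trainloader`, it runs on the CPU, and it uses `torch.zeros_like(grad)` because `torch.zeros` expects a shape rather than a tensor.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for Net(); two linear layers only.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))
criterion = nn.CrossEntropyLoss()
# Plain SGD: weight_decay is left out on purpose, since it adds a term to the
# update that does not pass through the hook.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Register the hooks once during setup, before any backward call.
handles = [
    layer.weight.register_hook(lambda grad: torch.zeros_like(grad))
    for layer in model
    if isinstance(layer, nn.Linear)
]

weights_before = [layer.weight.detach().clone() for layer in (model[0], model[2])]
bias_before = model[2].bias.detach().clone()

for _ in range(5):
    x = torch.randn(16, 4)             # random data in place of trainloader
    y = torch.randint(0, 3, (16,))
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()                    # the hooks fire here
    optimizer.step()

weights_after = [layer.weight.detach() for layer in (model[0], model[2])]
weights_unchanged = all(torch.equal(b, a) for b, a in zip(weights_before, weights_after))
bias_changed = not torch.equal(bias_before, model[2].bias.detach())
print(weights_unchanged, bias_changed)
```

Note that only the weights are hooked here, so the biases still receive their normal gradients and keep updating. With `weight_decay` set, as in the original optimizer, the weights would also still change even though their gradients are zeroed.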