Adjusting parameters in the forward pass

Given some arbitrary model, I would like to manipulate each parameter p before it is involved in a computation. Specifically, I'm interested in multiplying each p by some learnable constant c. I can't manually adjust each parameter after the forward pass is completed, because then the learnable constant would not be part of the computation graph. Ideally, the operation p*c would happen before p is used in the forward pass. Then, during the backward pass, both p and c would receive a gradient with respect to the loss.

I looked into hooks to achieve this, but all the posts I have seen so far just use hooks to print out the underlying tensor or gradient information. Is there a way to use hooks in the forward pass to adjust parameters or is this not possible? I could manually define each parameter in the network, and then have control over it; however, my current model has a lot of convolutional layers, so I would prefer not to do this manual work.


I tried to achieve this with hooks but ran into a problem.

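import torch
import torch.nn as nn

# `model` is the existing network (defined elsewhere).
c = nn.Parameter(torch.randn(1), requires_grad=True)
opt = torch.optim.SGD(model.parameters(), lr=0.001)
opt.add_param_group({'params': c})

def hooks(module, input):
    # Runs before each forward call; replaces the weight with a new Parameter.
    module.weight = nn.Parameter(module.weight * c)

for module in model.children():
    if isinstance(module, nn.Conv2d):
        module.register_forward_pre_hook(hooks)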
c is defined as an nn.Parameter, and since module.weight can only be assigned an nn.Parameter or None, I wrap the product in nn.Parameter. The hook is then registered on each child module.
However, c is still not being learned. What I found is:

  • module.weight is a leaf variable, which is fine since it is the tensor that gets updated.
  • c is also a leaf variable. Does that mean the gradient will not flow back to c, and that c is not included in the computation graph?
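
A small check may help narrow this down. It is only a sketch with toy tensors (w stands in for module.weight; the names are made up for illustration). As far as I can tell, c being a leaf is expected, since every parameter is a leaf. What cuts the graph is wrapping the product in nn.Parameter, which creates a new leaf with no history:

```python
import torch
import torch.nn as nn

w = nn.Parameter(torch.ones(3))        # stand-in for module.weight
c = nn.Parameter(torch.tensor(2.0))    # learnable constant

# What the hook does: wrap the product in a new Parameter.
rewrapped = nn.Parameter(w * c)
print(rewrapped.is_leaf, rewrapped.grad_fn)   # True None: a fresh leaf, no history

rewrapped.sum().backward()
c_grad_after_rewrapped = c.grad               # still None: nothing reached c

# Plain product: a non-leaf tensor that remembers w and c.
scaled = w * c
print(scaled.is_leaf, scaled.grad_fn is not None)   # False True

scaled.sum().backward()
print(c.grad)   # sum of w, i.e. 3
print(w.grad)   # c for each element, i.e. 2
```

So the product has to reach the layer as an ordinary tensor, not as a re-created Parameter, for c to receive a gradient.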

You can do something similar to weight normalization (pytorch/ at master · pytorch/pytorch · GitHub). With that approach, the learnable constants are involved in every forward and backward pass.
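
A minimal sketch of that idea, loosely modeled on how hook-based weight normalization works rather than copied from it. The helper scale_weight, the weight_orig name, and the toy model are assumptions for illustration. The original parameter is moved to weight_orig, and a forward pre-hook recomputes weight as a plain tensor weight_orig * c before every call:

```python
import torch
import torch.nn as nn

def scale_weight(module: nn.Module, c: nn.Parameter, name: str = "weight") -> None:
    """Recompute `name` as `<name>_orig * c` before every forward call (hypothetical helper)."""
    original = getattr(module, name)
    # Unregister the parameter so a plain (non-leaf) tensor can be assigned later.
    delattr(module, name)
    module.register_parameter(name + "_orig", nn.Parameter(original.detach()))

    def pre_hook(mod, inputs):
        # Plain tensor product: keeps both weight_orig and c in the graph.
        setattr(mod, name, getattr(mod, name + "_orig") * c)

    module.register_forward_pre_hook(pre_hook)

torch.manual_seed(0)
# Toy model, only for demonstration.
model = nn.Sequential(nn.Conv2d(1, 2, 3), nn.ReLU(), nn.Conv2d(2, 1, 3))
c = nn.Parameter(torch.ones(1))

for m in model.modules():              # modules() also reaches nested layers
    if isinstance(m, nn.Conv2d):
        scale_weight(m, c)

# c is not owned by the model here, so it is handed to the optimizer explicitly.
opt = torch.optim.SGD(list(model.parameters()) + [c], lr=0.001)

loss = model(torch.randn(4, 1, 8, 8)).pow(2).mean()
opt.zero_grad()
loss.backward()
opt.step()

print(c.grad is not None, model[0].weight_orig.grad is not None)
```

Note that the parameter is now stored as weight_orig, so state_dict keys change accordingly, and weight only exists as an attribute after the first forward call. Recent PyTorch versions also provide torch.nn.utils.parametrize.register_parametrization, which may be a cleaner way to get the same effect.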