For simplicity, say that output = net(input), where output has dimension 100 x 1. During training, I may want to zero out the gradient at positions 2 and 4 of output. How can I do this while still backpropagating the remaining gradients through the model? I was thinking of a two-step approach: first compute the gradient w.r.t. output, then zero out part of it, and finally compute the gradients for the rest of the model parameters using the modified output gradient.
You can register a hook on output that will zero out part of the gradient.
If I have two losses that both depend on the output, such as loss = loss1(output) + loss2(output), and I only want to zero the gradient coming from loss1 when adjusting the weights, how can I keep the hook from influencing loss2? I'm new to this, so I'm not sure whether the situation is even reasonable.
You will need to clone the output:
output = your_code_here()
output1 = output.clone()
output1.register_hook(zero_stuff_for_loss1_only)
loss = loss1(output1) + loss2(output)
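Here is one way this could look end to end. The net, the two MSE losses, and the mask over positions 2 and 4 are all hypothetical stand-ins for net, loss1, and loss2 from the question:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
net = nn.Linear(10, 100)       # stand-in for the real model
inp = torch.randn(1, 10)
target = torch.randn(1, 100)

output = net(inp)
output1 = output.clone()       # separate autograd node for loss1's branch

# Mask that zeroes gradient positions 2 and 4 (assumed indices).
mask = torch.ones_like(output)
mask[0, [2, 4]] = 0.0
output1.register_hook(lambda grad: grad * mask)

loss = F.mse_loss(output1, target) + F.mse_loss(output, target)
loss.backward()
```

Because clone creates a distinct node in the graph, the hook only sees the gradient flowing back through output1 (the loss1 branch). loss2's gradient reaches output directly, so positions 2 and 4 still receive loss2's full contribution, and the two branches are summed at output before flowing into the model.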