Changing a tensor in backward that is not used for gradient calculation


I’m trying to implement something like an eligibility trace (just treat it as a kind of weight) in an RNN. Basically, I need to:

  1. use this trace weight in forward calculation.
  2. update this trace weight in the backward function.

The problem is this: if I treat the trace as some kind of gradient, then I need to update it at every time step, which means calling optimizer.step() at every time step. That inevitably updates all the other parameters at every step as well, which is not what I want.

One solution I can think of is to update only the “gradient” of this trace weight manually at each step, and to call optimizer.step() only at the end of the whole sequence.
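
To make that idea concrete, here is a minimal sketch of the manual approach. Everything in it is an assumption for illustration: the `TraceRNNCell` class, the `update_trace` method, the `decay` value, and the Hebbian-style update rule are placeholders, not my actual trace rule. The trace is kept as a buffer so the optimizer never sees it, and it is reassigned rather than modified in place so that autograd can still use the old values during the backward pass.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class TraceRNNCell(nn.Module):
    """Hypothetical RNN cell whose trace is used in forward but not trained."""

    def __init__(self, input_size, hidden_size, decay=0.9):
        super().__init__()
        self.w_in = nn.Linear(input_size, hidden_size)
        self.w_rec = nn.Linear(hidden_size, hidden_size, bias=False)
        self.decay = decay  # assumed decay constant
        # A buffer belongs to the module but is not returned by parameters(),
        # so optimizer.step() never touches it.
        self.register_buffer("trace", torch.zeros(hidden_size, hidden_size))

    def forward(self, x, h):
        # The trace acts as an extra, non-learned recurrent weight.
        return torch.tanh(self.w_in(x) + self.w_rec(h) + h @ self.trace.t())

    @torch.no_grad()
    def update_trace(self, h_prev, h_new):
        # Placeholder update rule (an assumption); swap in the real one.
        outer = h_new.t() @ h_prev / h_prev.shape[0]
        # Reassign instead of updating in place: the old trace tensor is
        # still needed by autograd for the steps that already used it.
        self.trace = self.decay * self.trace + (1.0 - self.decay) * outer


cell = TraceRNNCell(input_size=4, hidden_size=8)
opt = torch.optim.SGD(cell.parameters(), lr=0.1)
xs = torch.randn(5, 2, 4)  # (time, batch, features)
target = torch.zeros(2, 8)

h = torch.zeros(2, 8)
opt.zero_grad()
for x in xs:
    h_new = cell(x, h)
    cell.update_trace(h, h_new)  # the trace changes at every step
    h = h_new

loss = ((h - target) ** 2).mean()
loss.backward()
opt.step()  # the learned parameters change once per sequence
```

With this layout the two update schedules are decoupled: the trace follows its own rule at each time step, while the ordinary weights are only updated once the whole sequence has been processed.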

But is there a more elegant solution than this, such as using the context to get and update the trace weight?
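
What I have in mind with the context idea is roughly the sketch below: a custom `torch.autograd.Function` that reads the trace in `forward` and modifies it as a side effect in `backward`. This is only a sketch under assumptions: the name `TraceLinear`, the `decay` argument, and the update rule (a running average of the weight gradient) are made up for illustration. The trace is stored as a plain attribute on `ctx` rather than through `save_for_backward`, so autograd does not complain when it is changed in place, and the effective weight is saved at forward time so that the gradients are still computed with the values that were actually used.

```python
import torch


class TraceLinear(torch.autograd.Function):
    """Hypothetical linear op: uses a trace in forward, updates it in backward."""

    @staticmethod
    def forward(ctx, x, weight, trace, decay):
        # Effective weight seen by the forward pass.
        w_eff = weight + trace
        # Save the effective weight, not the trace itself, so that later
        # in-place changes to the trace do not corrupt the gradients.
        ctx.save_for_backward(x, w_eff)
        ctx.trace = trace  # plain attribute, not tracked by autograd
        ctx.decay = decay
        return x @ w_eff.t()

    @staticmethod
    def backward(ctx, grad_out):
        x, w_eff = ctx.saved_tensors
        grad_x = grad_out @ w_eff
        grad_w = grad_out.t() @ x
        # Side effect: update the trace in place with an assumed rule.
        ctx.trace.mul_(ctx.decay).add_(grad_w, alpha=1.0 - ctx.decay)
        # No gradients for the trace or for the decay constant.
        return grad_x, grad_w, None, None


torch.manual_seed(0)
weight = torch.randn(3, 4, requires_grad=True)
trace = torch.zeros(3, 4)  # does not require grad, never given to an optimizer
x = torch.randn(2, 4)

out = TraceLinear.apply(x, weight, trace, 0.9)
out.sum().backward()  # fills weight.grad and updates trace as a side effect
```

If this were applied at every time step of the RNN, a single `backward()` call over the whole sequence would visit the steps in reverse time order and update the trace once per step, while `optimizer.step()` would still be called only once at the end.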