Hi @ptrblck, in another discussion I read that you suggest using the `no_grad()` context when modifying model parameters. In my case I didn't use it, because I want `B` to be updated by the optimizer through the in-place `add_` operation.

I did try wrapping the update in `no_grad()`, and it made the training time more stable (no slowdown), but, as I expected, `B` stopped updating. I'm not sure which of the two approaches is closer to the right solution.
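For context, the pattern I'm describing looks roughly like this (a minimal sketch; `B` here is a stand-in for my actual parameter, not my real code):

```python
import torch

# Hypothetical parameter standing in for B (name assumed from the discussion).
B = torch.nn.Parameter(torch.zeros(3))

# An in-place add_ on a leaf parameter that requires grad raises an error
# outside no_grad(), because autograd would have to track the mutation.
# Inside no_grad(), the values still change in place, but the update is
# invisible to autograd, so no gradient flows through this operation.
with torch.no_grad():
    B.add_(1.0)  # B's values become all ones; autograd does not record this

print(B.detach())  # tensor([1., 1., 1.])
print(B.requires_grad)  # True: the parameter still participates in later graphs
```

So with `no_grad()` the values of `B` do change in place, but the `add_` itself contributes nothing to the backward pass, which may be why the optimizer-driven behavior I wanted disappears.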