Changing weight after forward and before backward

Is it ok to change weights after the forward pass and before the backward pass?

Could you be more specific about how you change the weights?

Yes. I set the weights to some constant pre-defined matrix (for instance a matrix of ones or a matrix with random entries).

Did you do it as layer.weight = xxx or something like layer.weight.data.copy_(xxx)?

layer.weight.data = torch.randn(layer.weight.data.size())

It seems that if I use what you suggested,
layer.weight.data.copy_(torch.ones(layer.weight.data.size()))
then it works, but if I use
layer.weight.data.copy_(torch.sign(torch.randn(layer.weight.data.size())))
then the backprop returns NaN.

Never change the .data attribute of a Variable; it breaks the invariants that autograd relies on.

Yeah, it breaks precisely because of that. One should never operate directly on the .data of a Variable.

layer.weight = xxx will be fine.
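
For concreteness, a minimal sketch of that reassignment, assuming a plain nn.Linear layer (the module rejects bare tensors for parameter attributes, so the new value has to be wrapped in nn.Parameter):

import torch
import torch.nn as nn

layer = nn.Linear(4, 3)
# Rebinding the attribute requires an nn.Parameter; nn.Module will not
# accept a plain tensor for a registered parameter slot.
layer.weight = nn.Parameter(torch.ones(layer.weight.size()))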

My goal is to change the weights after the forward pass but before the backward pass.
If I use layer.weight = xxx then the backward pass ignores the new weights and uses the old ones. It seems that directly writing to layer.weight.data, which you advise against, is the only option that works.
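
Here is a small repro sketch of what I mean (a made-up nn.Linear layer): the backward pass still differentiates through the parameter that was recorded during the forward pass, so the gradients land on the old weight and the new one is ignored.

import torch
import torch.nn as nn

layer = nn.Linear(4, 3)
x = torch.randn(2, 4)
out = layer(x)                # the graph records the current weight tensor
old_weight = layer.weight
# Rebinding the attribute does not touch the tensor recorded in the graph.
layer.weight = nn.Parameter(torch.ones_like(old_weight))
out.sum().backward()
print(old_weight.grad is not None)   # True: gradients went to the old parameter
print(layer.weight.grad)             # None: the new parameter never took part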

The problem is that the weights can be used when calculating their own gradients, so if you modify the weights before the backward pass then the calculated gradients could be wrong, but maybe that is what you want.
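
For example, in a made-up two-layer sketch (using .data here only to demonstrate the effect), the gradient of the first layer's weights depends on the second layer's weights, so overwriting them between forward and backward silently changes the result:

import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 1))
x = torch.randn(2, 4)

# Reference run: forward and backward with untouched weights.
net.zero_grad()
net(x).sum().backward()
reference_grad = net[0].weight.grad.clone()

# Same forward pass, but overwrite the second layer's weights before backward.
net.zero_grad()
out = net(x)
net[1].weight.data.copy_(torch.sign(torch.randn_like(net[1].weight)))
out.sum().backward()

# The gradient flowing back through layer 1 was computed with the new layer-2
# weights, so the first layer's gradient no longer matches the reference.
print(torch.allclose(reference_grad, net[0].weight.grad))   # almost surely False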

Only by examining the calculation that the weights are involved in, and their size during the forward pass, could we hope to understand why you are getting NaNs. If you are trying to implement a paper, it may help us to know which one.

See @jpeg729's reply below. Also, the gradients won't be valid if you change the weights like that. If you want to apply gradients computed with one set of params to another set, just copy .grad after backward.
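
A rough sketch of that last suggestion (the names are made up, assuming a second parameter set of the same shape): run forward and backward with the first set, then copy the resulting .grad onto the second set.

import torch
import torch.nn as nn

layer = nn.Linear(4, 3)                        # params used for forward/backward
other_weight = nn.Parameter(torch.ones(3, 4))  # params that should receive the grads

x = torch.randn(2, 4)
layer(x).sum().backward()

# Transfer the gradient computed with layer.weight onto the other parameter set.
other_weight.grad = layer.weight.grad.clone()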