As a sanity check, can you save the values before and after (e.g., with `.detach().clone()`), or print them, and compare? I suspect that a and b are the same reference, so both are updated as the model is trained.
At this point, you (presumably) have already created your optimizer, so
your optimizer contains a collection of Parameters that it will be updating.
Because you have overwritten weight with a new Parameter (rather
than having overwritten weight.data with a new tensor) – and done so
after creating your optimizer – your optimizer does not contain the new Parameter that is used (via the forward pass) to calculate the loss.
(The Parameter that is modified by the optimizer – or would be, if it had
a non-trivial gradient – is the old Parameter that you have, in a sense, “hidden.”)
So self.hidden_layer_1.weight doesn’t change, because it is not in
the optimizer’s list of Parameters to update.
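To make this concrete, here is a minimal sketch (using a bare `nn.Linear` as a stand-in for `hidden_layer_1`) showing that a Parameter assigned after the optimizer is created is not among the Parameters the optimizer tracks:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

# Overwrite weight with a NEW Parameter *after* creating the optimizer:
model.weight = nn.Parameter(torch.ones(2, 4))

# The optimizer still holds the OLD Parameter object, not the new one:
opt_params = {id(p) for group in opt.param_groups for p in group["params"]}
print(id(model.weight) in opt_params)   # False -- the new weight is not tracked
```

So gradient steps taken by `opt` will never touch the new `model.weight`, even though the forward pass uses it.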
Why the loss is changing even though self.hidden_layer_1.weight
is not: I will assume that self.hidden_layer_1 is a Linear. A Linear
has both a weight and a bias. You haven’t overwritten bias, so it is
still updated by your optimizer, and still affects the loss you calculate.
Note that even after fixing the optimizer issue, you should not compare the weights via references.
For example, such a comparison will report the “saved” weights as unchanged, because both references point to the same underlying tensor. Copying the weights before training produces the expected result:
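A minimal sketch of the pitfall: `saved_ref` is just another name for the same tensor, so it “tracks” in-place updates, while a `.detach().clone()` snapshot stays frozen.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)

saved_ref = model.weight                   # reference -- same underlying tensor
saved_copy = model.weight.detach().clone() # true copy -- frozen snapshot

with torch.no_grad():
    model.weight.add_(1.0)                 # simulate an optimizer update

print(torch.equal(saved_ref, model.weight))   # True  -- reference followed the change
print(torch.equal(saved_copy, model.weight))  # False -- the copy did not change
```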
I do create my optimizer before this statement, you are correct.
So if I understand correctly, self.hidden_layer_1.weight.data = my_custom_weights should fix it?
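For reference, a minimal sketch of that fix (again using a bare `nn.Linear`, with a hypothetical `my_custom_weights` of matching shape). Copying into the existing Parameter – whether via `.data` or, more idiomatically, via an in-place `copy_` under `torch.no_grad()` – keeps the same Parameter object, so the optimizer continues to update the tensor used in the forward pass:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

my_custom_weights = torch.ones(2, 4)  # hypothetical custom initialization

# Copy values INTO the existing Parameter instead of replacing it:
with torch.no_grad():
    model.weight.copy_(my_custom_weights)

# The optimizer still tracks the same Parameter object:
opt_params = {id(p) for g in opt.param_groups for p in g["params"]}
print(id(model.weight) in opt_params)   # True -- updates will apply
```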