Initialise all model weights the same way each time


I fine-tune a transformer plus a linear layer on different few-shot datasets and then evaluate the model on a test set. However, it looks like the model weights are initialised differently on each run. While ideally they should converge as training converges, I was wondering whether it is possible to initialise all linear weights with the same values each time.


You could seed the code via torch.manual_seed before creating a new model instance. However, this seeds the global pseudorandom number generator (PRNG) and thus also determines the random sequence of every other call into it.
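A minimal sketch of the seeding approach, assuming a plain nn.Linear stands in for your linear head (the layer sizes and the helper names make_head and make_head_isolated are made up for illustration). The second helper wraps the seeding in torch.random.fork_rng, which should restore the global CPU PRNG state afterwards, so the rest of your script is not affected by the reseed:

```python
import torch
import torch.nn as nn

def make_head(seed: int = 0) -> nn.Linear:
    # Hypothetical helper: reseed the global PRNG, then build the layer.
    # Every later random call in the script is affected by this reseed.
    torch.manual_seed(seed)
    return nn.Linear(16, 4)

def make_head_isolated(seed: int = 0) -> nn.Linear:
    # Same idea, but the global CPU PRNG state is restored on exit.
    # devices=[] means no CUDA generator state is forked (CPU-only init).
    with torch.random.fork_rng(devices=[]):
        torch.manual_seed(seed)
        return nn.Linear(16, 4)

head_a = make_head()
head_b = make_head()
same_init = torch.equal(head_a.weight, head_b.weight) and torch.equal(head_a.bias, head_b.bias)

state_before = torch.get_rng_state()
head_c = make_head_isolated()
state_unchanged = torch.equal(state_before, torch.get_rng_state())
```

If the model is created directly on the GPU, the CUDA generator would also need to be taken into account, so treat this as a CPU-side sketch.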
Alternatively, you could initialize the model once, save its state_dict, and reload it afterwards for each new model.
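The state_dict approach could look roughly like this. build_model is a hypothetical stand-in for your transformer + linear layer, and the snapshot is kept in memory here; you could equally write it to disk with torch.save and read it back with torch.load:

```python
import copy
import torch
import torch.nn as nn

def build_model() -> nn.Module:
    # Hypothetical stand-in for the transformer + linear head.
    return nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 4))

# Initialize once and snapshot the weights. deepcopy matters because
# state_dict() returns references to the live parameter tensors.
reference = build_model()
initial_state = copy.deepcopy(reference.state_dict())

def fresh_model() -> nn.Module:
    # New instance gets its own random init, which is then overwritten
    # with the saved snapshot.
    model = build_model()
    model.load_state_dict(initial_state)
    return model

model_a = fresh_model()
model_b = fresh_model()
```

Each call to fresh_model then starts from the same weights, independent of the PRNG state at that point.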
