In that case you could create a “fixed” and a “trainable” tensor in a custom linear layer and concatenate them in each forward
pass. This would make sure that only the trainable part receives valid gradients and parameter updates.
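
Here is a minimal sketch of such a layer. The class name `PartiallyFrozenLinear`, the choice to freeze the first `n_frozen` output rows, and the toy initialization are my own assumptions for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PartiallyFrozenLinear(nn.Module):
    # Hypothetical layer: the weight is split along the output dimension into
    # a frozen slice (a buffer, so it gets no gradient) and a trainable slice.
    def __init__(self, in_features, out_features, n_frozen):
        super().__init__()
        # frozen part: registered as a buffer -> excluded from autograd/optimizer
        self.register_buffer("weight_fixed", torch.randn(n_frozen, in_features))
        # trainable part: a regular nn.Parameter
        self.weight_trainable = nn.Parameter(
            torch.randn(out_features - n_frozen, in_features))
        # bias kept fully trainable here for simplicity; split it the same way if needed
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        # concatenate the frozen and trainable slices in each forward pass
        weight = torch.cat([self.weight_fixed, self.weight_trainable], dim=0)
        return F.linear(x, weight, self.bias)

layer = PartiallyFrozenLinear(4, 6, n_frozen=2)
out = layer(torch.randn(8, 4))
out.mean().backward()
print(layer.weight_trainable.grad.shape)  # only the trainable slice has a gradient
print(layer.weight_fixed.grad is None)    # True: the frozen slice is a buffer
```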
This post gives you an example of such a layer and shows how to replace existing layers with it via torch.fx
in another model.
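
A sketch of the torch.fx swap could look like this, assuming every `nn.Linear` in the model should be replaced (the helper name and the `n_frozen` argument are mine; `symbolic_trace`, `get_submodule`, `add_submodule`, and `recompile` are standard torch.fx / nn.Module APIs):

```python
import torch
import torch.nn as nn
import torch.fx as fx

def replace_linears(model: nn.Module, n_frozen: int) -> fx.GraphModule:
    traced = fx.symbolic_trace(model)
    for node in traced.graph.nodes:
        # call_module nodes reference submodules by their dotted name
        if node.op == "call_module":
            mod = traced.get_submodule(node.target)
            if isinstance(mod, nn.Linear):
                new_mod = PartiallyFrozenLinear(
                    mod.in_features, mod.out_features, n_frozen)
                with torch.no_grad():
                    # copy the pretrained weight into the frozen/trainable slices
                    new_mod.weight_fixed.copy_(mod.weight[:n_frozen])
                    new_mod.weight_trainable.copy_(mod.weight[n_frozen:])
                    if mod.bias is not None:
                        new_mod.bias.copy_(mod.bias)
                # install the new module under the same target name
                traced.add_submodule(node.target, new_mod)
    traced.recompile()
    return traced
```

Since the replacement module has the same call signature as `nn.Linear`, the graph itself does not need to be rewritten; swapping the submodule and recompiling is enough.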