For the same purpose (i.e. fixing a subset of parameters during training while still updating other parameters that may need gradients from parameters in the fixed set), I found the first method much easier to implement than writing a register_hook function (e.g. as in "Update only sub-elements of weights"). Thanks for the great question and the great explanation here!