Sorry for posting a wrong solution before.
The reason it does not work as expected is that Python’s in uses ==, and for tensors == does an elementwise comparison rather than an identity check.
Using the parameter names will work (you could also hack around it by keeping a set of p.data_ptr() and filter by that, but that is ugly…):
fc_params = [p for n, p in m.named_parameters() if n.startswith('fc.')]
other_params = [p for n, p in m.named_parameters() if not n.startswith('fc.')]
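For completeness, a minimal sketch of passing the two groups to an optimizer with different learning rates; the toy model (standing in for e.g. a torchvision ResNet whose head is named fc) and the lr values are just assumptions for illustration:

```python
import torch
import torch.nn as nn

# Toy stand-in for a model whose final layer is named "fc" (assumption).
m = nn.Sequential()
m.add_module('features', nn.Linear(8, 8))
m.add_module('fc', nn.Linear(8, 2))

# Split parameters by name, as above.
fc_params = [p for n, p in m.named_parameters() if n.startswith('fc.')]
other_params = [p for n, p in m.named_parameters() if not n.startswith('fc.')]

# Two parameter groups with different learning rates, e.g. for finetuning:
# a small lr for the pretrained body, a larger one for the fresh head.
opt = torch.optim.SGD(
    [{'params': other_params, 'lr': 1e-4},
     {'params': fc_params, 'lr': 1e-2}],
    momentum=0.9,
)
```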
Do you think it would be simpler if, instead of raising an error when parameters appear in more than one parameter group, we supported overriding the learning rate?
For my use case that would have been much simpler. Of course it creates the possibility of mistakes on the user’s side, but in my opinion the positives outweigh the negatives!
To be honest, I think it is a very special application where you need this and don’t have it conveniently available.
For example, (I think) the fast.ai library (Jeremy Howard advocates graded learning rates for finetuning) sticks the various modules in a Sequential module and then gets the parameter groups by iterating over the submodules.
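A hedged sketch of that idea (the model stages and lr values are made up for illustration, not fast.ai’s actual code): stack the stages in a Sequential and build one parameter group per submodule, with the learning rate growing towards the head.

```python
import torch
import torch.nn as nn

# Illustrative three-stage network; in practice these would be the
# pretrained body stages and the new head.
model = nn.Sequential(
    nn.Linear(8, 8),   # early stage
    nn.Linear(8, 8),   # middle stage
    nn.Linear(8, 2),   # head
)

lrs = [1e-4, 1e-3, 1e-2]  # one lr per submodule (values are illustrative)

# One parameter group per submodule, obtained by iterating over children.
param_groups = [
    {'params': list(sub.parameters()), 'lr': lr}
    for sub, lr in zip(model.children(), lrs)
]
opt = torch.optim.SGD(param_groups, momentum=0.9)
```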
The other option is to use the parameter names; there probably are more elegant solutions than the above if you need this in a systematic way.
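And for reference, the data_ptr() hack mentioned above can be sketched like this (same toy model as before, purely for illustration): collect the storage pointers of one group in a set and filter the remaining parameters by membership. It works, but the name-based filtering is usually cleaner.

```python
import torch
import torch.nn as nn

# Toy stand-in for a model whose final layer is named "fc" (assumption).
m = nn.Sequential()
m.add_module('features', nn.Linear(8, 8))
m.add_module('fc', nn.Linear(8, 2))

# Identify the fc parameters by their storage pointers instead of by name.
fc_ptrs = {p.data_ptr() for p in m.fc.parameters()}
other_params = [p for p in m.parameters() if p.data_ptr() not in fc_ptrs]
```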