Do I need to both set `requires_grad=True` and only pass those parameters I want to update to the optimizer to perform layerwise training?

Hi, I am trying to train a model with transfer learning. When I set `requires_grad=True` for only a few layers but pass all of my model's parameters to the optimizer (via `model.parameters()`), I see that even the layers with `requires_grad=False` get their weights updated. However, if I instead use `filter(lambda p: p.requires_grad, net.parameters())` to select the parameters passed to the optimizer, only the correct weights get updated.
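For illustration, a minimal sketch of the two optimizer setups described above (the model, layer split, and learning rate are placeholders, not taken from the actual code):

```python
import torch
import torch.nn as nn

# Toy two-layer model; the architecture and the choice of frozen layer are illustrative.
model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))

# Freeze the first layer; the second stays trainable.
for p in model[0].parameters():
    p.requires_grad = False

# Setup 1: hand every parameter to the optimizer.
opt_all = torch.optim.SGD(model.parameters(), lr=0.1)

# Setup 2: hand over only the trainable parameters.
opt_trainable = torch.optim.SGD(
    filter(lambda p: p.requires_grad, model.parameters()), lr=0.1
)
```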

Is this the correct behavior, or is there some error in my code?
If this is the correct behavior, then how does my optimizer update weights when requires_grad=False?

Can you share a code snippet?

If these layers were previously updated, i.e. they received a valid gradient in earlier iterations, and are now frozen via `.requires_grad = False`, optimizers with running stats (such as Adam) can still update these parameters.
To avoid this, you could set their `.grad` attributes to `None`, which should skip the updates.
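A minimal sketch of this effect with a single parameter (the toy setup and learning rate are assumptions, not code from the thread): Adam keeps moving a parameter whose gradient is zero because of its accumulated running statistics, while a `.grad` of `None` makes the optimizer skip that parameter entirely.

```python
import torch

# Single parameter; one step with a real gradient so Adam builds running stats.
p = torch.nn.Parameter(torch.ones(1))
opt = torch.optim.Adam([p], lr=0.1)
p.grad = torch.ones(1)
opt.step()

# "Freeze" the parameter and give it a zero gradient
# (as zero_grad(set_to_none=False) would).
p.requires_grad = False
p.grad = torch.zeros(1)
before = p.detach().clone()
opt.step()
print(torch.equal(before, p))  # False: Adam's running stats still move the parameter

# Setting .grad to None makes the optimizer skip this parameter.
p.grad = None
before = p.detach().clone()
opt.step()
print(torch.equal(before, p))  # True: the parameter is left untouched
```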

Thanks for the clarification. However, I am not using Adam. I am using vanilla SGD with no momentum.

In that case, could you post an executable code snippet to show this behavior, as asked by @dbp.pat94?