If you try to set requires_grad to False for a layer inside a with block that has grad enabled, does this override the with clause?
with torch.set_grad_enabled(mode=True):
    self.model.lin2.requires_grad = False  # Will this work?
    for m in self.model.mlp_f: m.set_grad_enabled = False  # Or this?
Keep in mind that
set_grad_enabled is a state of the program (“do you want to keep track of gradients for outputs when the inputs require gradients”) and applies to new tensors, while
someparam.requires_grad_(False) (which is the suggested form to disable gradients of tensors and parameters) says “this thing, when used as input, doesn’t require gradients”.
To then decide whether a given operation’s output requires gradients, the autograd engine checks if both gradient-mode is enabled and any inputs require gradients.
As such, these are two distinct knobs that you can operate independently, but they have a combined effect.
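As a quick sketch of that combined effect on a toy tensor (not your model), showing that the output only tracks gradients when both grad mode is on and an input requires grad:

```python
import torch

x = torch.randn(3, requires_grad=True)

# Grad mode on + input requires grad -> output tracks gradients
with torch.set_grad_enabled(True):
    y = x * 2
print(y.requires_grad)  # True

# Grad mode off overrides the input's flag -> no tracking
with torch.set_grad_enabled(False):
    y = x * 2
print(y.requires_grad)  # False

# Grad mode on, but input does not require grad -> no tracking either
x.requires_grad_(False)
with torch.set_grad_enabled(True):
    y = x * 2
print(y.requires_grad)  # False
```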
Your snippet seems to try to set this on Modules, which isn't a thing (and using
requires_grad_ would tell you so, which is why it is preferred). If you want something like that, go for
for p in self.model.mlp_f.parameters(): p.requires_grad_(False), which certainly works.
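To see the suggested loop in action on a toy stand-in for your model (a two-layer nn.Sequential here, since I don't have your mlp_f): freezing the second layer's parameters stops them from accumulating gradients, while gradients still flow through it to the first layer.

```python
import torch
from torch import nn

# Hypothetical small model standing in for self.model in the question
model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))

# Freeze all parameters of the second layer
for p in model[1].parameters():
    p.requires_grad_(False)

model(torch.randn(1, 4)).sum().backward()

print(model[0].weight.grad is not None)  # True: first layer still gets grads
print(model[1].weight.grad)              # None: frozen layer accumulates nothing
```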
Ok, thank you, that is useful.