Need for a function to make grads None

I have a set of networks, only some of which are used in any one forward pass, and hence only their weights should be updated by a call to backward.
The caveat is that which networks are selected for the forward pass depends on the input, so when initialising the optimiser I pass it the parameters of all the networks.

This creates a problem with an optimiser like Adam, which keeps running averages and will update parameters even when their gradients are 0.

For example: if N1 and N2 are used for the first input, their grads are initialised to actual numbers. If networks N2 and N3 are used for the next input, then naively taking an optimiser step after zero_grad won’t prevent updates to N1, since its gradients will be 0, not None.
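A minimal sketch of the problem (names are illustrative): a single parameter stands in for network N1. After one real step, Adam's running averages are populated, so a later step with an explicitly zeroed gradient still moves the weight.

```python
import torch

# One parameter standing in for network N1.
p = torch.nn.Parameter(torch.ones(1))
opt = torch.optim.Adam([p], lr=0.1)

p.grad = torch.ones(1)          # N1 participated in this input
opt.step()
w_after_real_step = p.detach().clone()

p.grad = torch.zeros(1)         # N1 unused; grad zeroed, not None
opt.step()
w_after_zeroed_step = p.detach().clone()

# The weight moved again, driven purely by Adam's momentum buffers.
print(torch.equal(w_after_real_step, w_after_zeroed_step))  # False
```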

In Adam’s code, a network’s parameters are skipped only if their grad is `None`, which is as expected. But to solve the issue above, I believe it would be useful to have something like a `none_grad` function for optimisers and networks.

Suggestions for any alternative methods to do this task are welcome.

No such function exists.
I am sure it could be implemented easily at the nn.Module level, like zero_grad() is. @smth do you think this is something we want in core, or is it more of a user-specific usage?
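A sketch of what such a helper could look like. `none_grad` is a hypothetical name, not part of PyTorch; it mirrors `nn.Module.zero_grad()` but drops the gradient tensors entirely, so optimisers like Adam skip those parameters.

```python
import torch

# Hypothetical helper (not in PyTorch): set every parameter's grad to
# None instead of zeroing it in place.
def none_grad(module: torch.nn.Module) -> None:
    for p in module.parameters():
        p.grad = None

net = torch.nn.Linear(2, 2)
net(torch.randn(1, 2)).sum().backward()   # grads are now tensors
none_grad(net)
print(all(p.grad is None for p in net.parameters()))  # True
```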

Would it be possible to create an optimizer for each model and then select the model with the corresponding optimizer using your condition?
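A rough sketch of that suggestion (all names illustrative): keep one optimizer per sub-network and step only the ones the routing condition selected for this input.

```python
import torch

# One optimizer per sub-network.
nets = {name: torch.nn.Linear(4, 4) for name in ("N1", "N2", "N3")}
opts = {name: torch.optim.Adam(net.parameters()) for name, net in nets.items()}

x = torch.randn(1, 4)
selected = ["N1", "N2"]          # stand-in for the input-dependent routing

loss = sum(nets[name](x).sum() for name in selected)
loss.backward()
for name in selected:            # only the selected networks are stepped
    opts[name].step()
    opts[name].zero_grad()
```

Since N3 never took part in the forward pass, its grads stay `None` and its optimizer is never stepped.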

Yes, but that would defeat the purpose of autograd in providing an easy backward pass.

The networks are selected in the forward pass based on some characteristics of the input, so I would have to repeat those computations to select the corresponding optimisers for each input.

Having a function that sets all grads to None would keep the code simple.
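The simpler version would look something like this (a sketch under the assumptions above, with a manual loop standing in for the proposed function): one optimizer over all sub-networks, with grads set to None between inputs so Adam skips whichever networks were unused.

```python
import torch

# One optimizer over all sub-networks.
nets = torch.nn.ModuleList([torch.nn.Linear(4, 4) for _ in range(3)])
opt = torch.optim.Adam(nets.parameters())

for selected in ([0, 1], [1, 2]):    # stand-in for input-dependent routing
    x = torch.randn(1, 4)
    loss = sum(nets[i](x).sum() for i in selected)
    loss.backward()
    opt.step()                       # params with grad=None are skipped
    for p in nets.parameters():      # the proposed "None_grad" behaviour
        p.grad = None
```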