Does a list of layers still get trained in backprop if it is not wrapped in nn.ModuleList?


I am aware of the fact that when using a list of layers we need to wrap them in nn.ModuleList so that the parameters get registered properly. But is there any chance that they will still get gradients and be trained if I do not wrap them in a ModuleList?

Note: This is not a custom layer. They are not being registered manually either.

E.g.: self.affine_layers = [nn.Linear(self.affine_layers_dim_in, self.affine_layers_dim_op) for x in range(self.features)]

Thanks in advance.

As far as I know, there is a chance if the parameters of those modules are added manually to the optimizer instead of just using net.parameters().
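
A minimal sketch of that case, assuming a hypothetical Net class with a plain Python list of layers (the class, the head layer, and the dimensions are made up for illustration, not taken from your code). The list's parameters are collected by hand and passed to the optimizer alongside net.parameters():

```python
import itertools

import torch
import torch.nn as nn


class Net(nn.Module):  # hypothetical model, for illustration only
    def __init__(self, dim_in=4, dim_out=3, features=2):
        super().__init__()
        self.head = nn.Linear(dim_out, 1)  # registered as usual
        # Plain Python list: these layers are NOT registered with the module.
        self.affine_layers = [nn.Linear(dim_in, dim_out) for _ in range(features)]

    def forward(self, x):
        return sum(self.head(layer(x)) for layer in self.affine_layers)


torch.manual_seed(0)
net = Net()

# Collect the unregistered parameters by hand and give them to the optimizer.
list_params = [p for layer in net.affine_layers for p in layer.parameters()]
optimizer = torch.optim.SGD(itertools.chain(net.parameters(), list_params), lr=0.1)

before = [p.detach().clone() for p in list_params]
loss = net(torch.randn(8, 4)).pow(2).mean()
loss.backward()
optimizer.step()

# True if every hand-added parameter was changed by the optimizer step.
updated = all(not torch.equal(b, p) for b, p in zip(before, list_params))
print(updated)
```

Without the itertools.chain part, the optimizer in this sketch would only ever see the parameters of head.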

If you are talking about nn.Parameter(), no, they are not being added manually. Again, I think nn.Parameter() is for tensor variables and would probably be used when creating a custom layer.

I have updated the question and given more details.


The same answer still holds. Unless you add the parameters of the modules in the list to the optimizer manually, the optimizer won’t know that those modules exist in the model (net.parameters() won’t return the parameters of modules held in a plain Python list).
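
A quick way to check this, sketched with two hypothetical classes (the names and dimensions are assumptions for illustration): count what parameters() returns when the layers live in a plain list versus an nn.ModuleList.

```python
import torch.nn as nn


class PlainListNet(nn.Module):  # hypothetical: layers kept in a plain list
    def __init__(self, dim_in=4, dim_out=3, features=2):
        super().__init__()
        self.affine_layers = [nn.Linear(dim_in, dim_out) for _ in range(features)]


class ModuleListNet(nn.Module):  # hypothetical: same layers, wrapped
    def __init__(self, dim_in=4, dim_out=3, features=2):
        super().__init__()
        self.affine_layers = nn.ModuleList(
            [nn.Linear(dim_in, dim_out) for _ in range(features)]
        )


n_plain = len(list(PlainListNet().parameters()))
n_registered = len(list(ModuleListNet().parameters()))

# Two Linear layers have a weight and a bias each, so 4 tensors when registered.
print(n_plain, n_registered)  # 0 4
```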

I see, so if they definitely do not show up in net.parameters(), they are not receiving gradients and won’t be trained? Are there no cases where PyTorch might still give them gradients in some unexpected way?

This would be my last question, sorry if I am being really specific. I do not have the time to check the source code.


They might be receiving gradients, since they may be part of the computation graph that is built dynamically during forward(). But the parameters won’t be updated, because the optimizer never sees them and so never acts on those gradients.
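
Here is a small sketch of that behavior, again with a hypothetical Net (the head layer exists only so that net.parameters() is not empty, which the optimizer would reject). After backward() the unregistered layer has a .grad, but optimizer.step() leaves its weights untouched:

```python
import torch
import torch.nn as nn


class Net(nn.Module):  # hypothetical model, for illustration only
    def __init__(self, dim_in=4, dim_out=3, features=2):
        super().__init__()
        self.head = nn.Linear(dim_out, 1)  # registered, so the optimizer sees it
        # Plain Python list: not registered, invisible to net.parameters().
        self.affine_layers = [nn.Linear(dim_in, dim_out) for _ in range(features)]

    def forward(self, x):
        return sum(self.head(layer(x)) for layer in self.affine_layers)


torch.manual_seed(0)
net = Net()
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)

layer = net.affine_layers[0]
before = layer.weight.detach().clone()

loss = net(torch.randn(8, 4)).pow(2).mean()
loss.backward()
optimizer.step()

has_grad = layer.weight.grad is not None        # autograd reached the layer
unchanged = torch.equal(before, layer.weight)   # but the optimizer did not
print(has_grad, unchanged)  # True True
```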

Is there any specific behavior you are seeing that’s not consistent? It would be good to know.

There might be some inconsistent behavior, but I might not be able to demonstrate it. I just wanted to know if the weights of layers get updated when they are not wrapped in an nn.ModuleList(). It seems they do not (as I thought, given the documentation and past experience). This question was a sanity check.