Not wrapping list of layers with a nn.ModuleList still trains the layer in backprop?

abhinavshaw · February 18, 2019, 6:34pm

Hi

I am aware of the fact that when using a list of layers we need to wrap them in nn.ModuleList so that the parameters get registered properly. But is there any chance that they will still get gradients and be trained if I do not wrap them in a ModuleList?

Note: This is not a custom layer. They are not being registered manually either.

E.g.: self.affine_layers = [nn.Linear(self.affine_layers_dim_in, self.affine_layers_dim_op) for x in range(self.features)]

Thanks in advance.

InnovArul · February 18, 2019, 6:46pm

As far as I know, they can still be trained if the parameters of those modules are added manually to the optimizer, instead of just passing net.parameters().
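A minimal sketch of that case, assuming a hypothetical model (ManualNet, its layer sizes, and the SGD settings are made up for illustration). The layers sit in a plain Python list, but their parameters are passed to the optimizer by hand, so they do get updated:

```python
import torch
import torch.nn as nn


class ManualNet(nn.Module):  # hypothetical model, not from the original post
    def __init__(self, dim_in=4, dim_out=3, features=2):
        super().__init__()
        # Plain Python list: these layers are NOT registered as submodules.
        self.affine_layers = [nn.Linear(dim_in, dim_out) for _ in range(features)]
        # Regular attribute: this layer IS registered.
        self.head = nn.Linear(dim_out, 1)

    def forward(self, x):
        out = sum(layer(x) for layer in self.affine_layers)
        return self.head(out)


torch.manual_seed(0)
net = ManualNet()

# Collect the unregistered parameters by hand and add them to the optimizer.
list_params = [p for layer in net.affine_layers for p in layer.parameters()]
optimizer = torch.optim.SGD(list(net.parameters()) + list_params, lr=0.1)

before = [p.detach().clone() for p in list_params]

loss = net(torch.randn(8, 4)).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()

# True for every parameter that changed after one optimizer step.
changed = [not torch.equal(b, p) for b, p in zip(before, list_params)]
print(changed)
```

Even in this setup the layers are still invisible to net.parameters(), so wrapping them in nn.ModuleList is usually the simpler route.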

abhinavshaw · February 18, 2019, 7:11pm

If you are talking about nn.Parameter(), no, they are not being added manually. Again, I think nn.Parameter() is for tensor variables and would probably be used when creating a custom layer.

I have updated the question and given more details.

Thanks!

InnovArul · February 18, 2019, 8:01pm

The same answer still holds. Unless you add the parameters of the modules in the list to the optimizer manually, the optimizer won’t know that those modules exist in the model (net.parameters() won’t reveal the parameters of modules held in a plain Python list).
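A quick way to check this, sketched with two hypothetical classes (PlainListNet and ModuleListNet, with made-up layer sizes), is to count what parameters() returns in each case:

```python
import torch.nn as nn


class PlainListNet(nn.Module):  # hypothetical: layers kept in a plain list
    def __init__(self):
        super().__init__()
        self.affine_layers = [nn.Linear(4, 3) for _ in range(2)]


class ModuleListNet(nn.Module):  # hypothetical: same layers, wrapped
    def __init__(self):
        super().__init__()
        self.affine_layers = nn.ModuleList([nn.Linear(4, 3) for _ in range(2)])


plain_count = len(list(PlainListNet().parameters()))
wrapped_count = len(list(ModuleListNet().parameters()))

# Each nn.Linear contributes a weight and a bias tensor.
print(plain_count, wrapped_count)  # 0 4
```

The plain-list version exposes no parameters at all, while the nn.ModuleList version exposes the weight and bias of both layers.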

abhinavshaw · February 18, 2019, 8:07pm

I see. So if they are not revealed by net.parameters(), they are definitely not receiving gradients and won’t be trained? There are no cases where PyTorch might be giving them gradients in some unexpected way?

This would be my last question, sorry if I am being really specific. I do not have the time to check the source code.

Best
AB

InnovArul · February 18, 2019, 8:09pm

They might be receiving gradients, as they might be part of the computation graph dynamically created during the forward() call. But the parameters won’t be updated, as the optimizer is not acting on those gradients.
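A rough sketch of this, assuming a hypothetical model (GradOnlyNet and its sizes are invented for illustration) whose forward() uses layers stored in a plain list, with the optimizer built from net.parameters() only:

```python
import torch
import torch.nn as nn


class GradOnlyNet(nn.Module):  # hypothetical model for illustration
    def __init__(self):
        super().__init__()
        self.affine_layers = [nn.Linear(4, 3) for _ in range(2)]  # not registered
        self.head = nn.Linear(3, 1)  # registered

    def forward(self, x):
        out = sum(layer(x) for layer in self.affine_layers)
        return self.head(out)


torch.manual_seed(0)
net = GradOnlyNet()
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)  # only sees self.head

hidden = net.affine_layers[0].weight
hidden_before = hidden.detach().clone()
head_before = net.head.weight.detach().clone()

loss = net(torch.randn(8, 4)).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()

has_grad = hidden.grad is not None                   # autograd filled in .grad
hidden_unchanged = torch.equal(hidden_before, hidden)  # but no update happened
head_changed = not torch.equal(head_before, net.head.weight)
print(has_grad, hidden_unchanged, head_changed)
```

One side effect worth noting: optimizer.zero_grad() only clears the parameters the optimizer knows about, so the gradients on the unregistered layers keep accumulating across iterations.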

Is there any specific behavior you see that’s not consistent? It would be good to know.

abhinavshaw · February 18, 2019, 8:14pm

There might be some inconsistent behavior, but I might not be able to demonstrate it. I just wanted to know whether the weights of layers get updated if they are not wrapped in an nn.ModuleList(). It seems they do not (as I thought, considering the documentation and past experience). This question was a sanity check.

Thanks
AB