I read some posts about ModuleList, and all of them said that adding modules to a ModuleList gives access to the parameters of the neural network. But in the “Training a classifier” example of the 60-minute PyTorch tutorial, the modules are not added to any ModuleList, and the parameters can still be accessed using
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
This is confusing. Please clarify how the parameters are accessible even though the modules have not been added to any ModuleList.
You don’t need an nn.ModuleList to register the parameters properly.
Could you link the topic, where this is stated, please?
Could you link the topic, where this is stated, please?
The parameters are registered with the assignment in your model:
self.fc = nn.Linear(1, 1) # here the parameters of the linear layer are registered
self.register_buffer('my_buffer', torch.tensor(1)) # here the buffer is registered
The __setattr__ method in nn.Module takes care of this.
Note that plain tensors (not nn.Modules or buffers) will not be registered and will thus not be returned in model.parameters().
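A minimal sketch of this registration behaviour (the Net class and the attribute names are my own), showing that an assigned nn.Linear and a registered buffer are picked up, while a plain tensor attribute is not:

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(1, 1)                           # registered via __setattr__
        self.register_buffer('my_buffer', torch.tensor(1))  # registered as a buffer
        self.plain = torch.randn(3)                         # plain tensor: NOT registered

net = Net()
print(sorted(name for name, _ in net.named_parameters()))  # ['fc.bias', 'fc.weight']
print([name for name, _ in net.named_buffers()])           # ['my_buffer']
print('plain' in net.state_dict())                         # False
```

The buffer and the layer's parameters also end up in state_dict(), while the plain tensor is silently skipped.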
I am referring to this thread of discussion - The difference in usage between nn.ModuleList and python list
In this thread smth said
When you ask for model.parameters(), the parameters of layers inside an nn.ModuleList will be returned. But if it is a regular list, we don’t look inside the list for parameters.
I found one more thread - When should I use nn.ModuleList and when should I use nn.Sequential?
Here you said
Exactly! If you use a plain python list, the parameters won’t be registered properly and you can’t pass them to your optimizer using model.parameters().
Yes, that’s the difference between a Python list and an nn.ModuleList.
As explained in the linked topics, the parameters wrapped in a plain list won’t be registered, while the parameters from all modules inside an nn.ModuleList will be registered.
So if you want to use a list-like container, then the answer to the initial question is: yes, it’s mandatory to use nn.ModuleList instead of a plain list to register all parameters.
If you don’t need a list-like container and just want to register parameters or layers inside an nn.Module class, then you can just use the assignment.
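A small side-by-side comparison (both class names are hypothetical), showing that only the nn.ModuleList variant exposes the contained layers' parameters:

```python
import torch.nn as nn

class WithList(nn.Module):
    def __init__(self):
        super().__init__()
        # plain Python list: the layers are NOT registered
        self.layers = [nn.Linear(4, 4), nn.Linear(4, 2)]

class WithModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ModuleList: the layers ARE registered
        self.layers = nn.ModuleList([nn.Linear(4, 4), nn.Linear(4, 2)])

print(len(list(WithList().parameters())))        # 0
print(len(list(WithModuleList().parameters())))  # 4 (weight + bias per layer)
```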
Sorry, I am unable to understand the list-like container scenario. Even if a Python list contains, let’s say, 2 linear layers, we would be iterating over the list and using those linear layers in the forward method of a user-defined class (subclassed from nn.Module). Wouldn’t that work?
It would work to use these layers in the forward pass, but model.parameters() will not return the parameters of these layers (and thus optim.SGD(model.parameters(), lr=1.) will not see these parameters either).
Also, transferring the parameters to a device via model.to('cuda') won’t transfer these parameters to the desired device.
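To make this failure mode concrete, a sketch (the Hidden class name is my own): the forward pass runs fine, but model.parameters() is empty, so the optimizer refuses it:

```python
import torch
import torch.nn as nn
import torch.optim as optim

class Hidden(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = [nn.Linear(2, 2), nn.Linear(2, 1)]  # plain Python list

    def forward(self, x):
        for layer in self.layers:   # using the layers in forward works fine
            x = layer(x)
        return x

model = Hidden()
out = model(torch.randn(1, 2))      # runs without error
params = list(model.parameters())
print(len(params))                  # 0 -> the layers were never registered

try:
    optim.SGD(model.parameters(), lr=1.)
except ValueError as e:
    print(e)                        # "optimizer got an empty parameter list"
```

Similarly, model.to('cuda') would leave the two Linear layers on the CPU, since .to() only visits registered submodules.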
So, if a module is added to a Python list, its parameters become invisible. Thanks for clearly explaining this concept to me. I suggest this should be clearly documented with an example in PyTorch’s documentation of nn.ModuleList.
nn.ModuleList mentions the advantage here:
ModuleList can be indexed like a regular Python list, but modules it contains are properly registered, and will be visible by all Module methods.
If you think this description is still confusing, would you be interested in creating a PR with an improvement?
Yes, sure. I have never contributed to any open-source project yet, not even documentation.
Then this is the best time to start doing it!
You could create a feature request here, explain your confusion and your suggestion, and wait until the module owners start the discussion.
I created an issue #38639