Nested modules in PyTorch and parameter updates

Suppose that I defined a class as follows:

class A(nn.Module):
    def __init__(self, some_params):
        super().__init__()
        # SOMETHING HERE (for example several Linear layers)

    def forward(self, x):
        # ...
Now, suppose I want to use an array of class A instances in another model. Does the following approach work?

class B(nn.Module):
    def __init__(self, some_params):
        super().__init__()
        self.a = []
        for i in range(100):
            self.a.append(A(some_params))

    def forward(self, x):
        outs = []
        for i in range(100):
            outs.append(self.a[i](x[i]))
        outs_concat = outs[0]
        for i in range(1, 100):
            outs_concat = torch.cat((outs_concat, outs[i]), 1)
        return outs_concat

Moreover, how can I be sure about the parameters? Do the parameters of module A act as parameters of module B? I want to update all parameters when I call the optimizer, and I don't know how to make sure all of them are updated.
I am looking forward to seeing your solutions.

This won’t work, as your A modules won’t be registered when stored in a plain Python list, so you would have to use an nn.ModuleList instead.
This will make sure to register the passed module as well as all parameters in the parent module.
Calling B.parameters() will thus yield all parameters.
You could double check it via print(dict(modelB.named_parameters())).
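As a minimal sketch of that check (the layer sizes and the count of 3 submodules are placeholders, not values from the question), compare a plain Python list against an nn.ModuleList:

```python
import torch
import torch.nn as nn

class A(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 4)

    def forward(self, x):
        return self.fc(x)

class B(nn.Module):
    def __init__(self):
        super().__init__()
        # plain list: these submodules are NOT registered
        self.plain = [A() for _ in range(3)]
        # nn.ModuleList: these submodules and their parameters ARE registered
        self.a = nn.ModuleList([A() for _ in range(3)])

    def forward(self, x):
        return torch.cat([m(x) for m in self.a], dim=1)

modelB = B()
names = list(dict(modelB.named_parameters()))
# Only the ModuleList entries appear, e.g. 'a.0.fc.weight', 'a.0.fc.bias', ...
# nothing from self.plain shows up, so the optimizer would never see it.
print(names)
```

Passing `modelB.parameters()` to an optimizer therefore covers every submodule held in the ModuleList, but none of those held in the plain list.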

Would you please tell me how can I iterate ModuleList in forward method?

You can iterate it like a plain Python list. It’s a drop-in replacement, which makes sure to register all parameters.
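For instance (a sketch, assuming all submodules accept the same input shape; the Linear sizes are placeholders):

```python
import torch
import torch.nn as nn

class B(nn.Module):
    def __init__(self):
        super().__init__()
        self.a = nn.ModuleList([nn.Linear(10, 4) for _ in range(5)])

    def forward(self, x):
        # iterate exactly like a plain Python list
        outs = [layer(x) for layer in self.a]
        return torch.cat(outs, dim=1)

b = B()
y = b(torch.randn(3, 10))
print(y.shape)  # torch.Size([3, 20])
```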

Whenever I use the following code in the forward method of class B, it does not work when the input contains multiple samples:

for i in range(10):
    out = self.a[i](X[i])

I mean that if the input array has shape (batch_size, 5, 10) with batch_size = 3,
then passing it to an instance of class B raises an index-out-of-bounds error.

b = B()
y_hat = b(input_array[index : index + batch_size])

Would you please help me to know how to work with batches in this case?

I assume you would like to feed the batch to the first layer and then pipe the output through all others?
If so, then this code snippet should work:

out = self.a[0](X)
for i in range(1, 10):
    out = self.a[i](out)

Actually, each sample in X has several features, and each feature must be fed to its own self.a[i] separately; then I would like to concatenate all of their outputs.

What’s the shape of X?
If it’s [batch_size, nb_features=10], then this should work:

out = []
for i in range(10):
    out.append(self.a[i](X[:, i]))
out = torch.stack(out)

In this example, the input has shape (batch_size, 5, 10).
That means I have batch_size samples and 5 features, where each feature is a vector of size 10.
Now, I want to feed each feature i to its corresponding self.a[i].