Nested Modules in PyTorch and Parameter Updates

Suppose that I defined a class as follows:

class A(nn.Module):
    def __init__(self, some_params):
        super().__init__()
        # SOMETHING HERE (for example, several nn.Linear layers)

    def forward(self, x):
        # SOMETHING ELSE HERE

Now, suppose I want to use an array of A instances in another model. Does the following approach work?

class B(nn.Module):
    def __init__(self, some_params):
        super().__init__()
        self.a = []
        for i in range(100):
            self.a.append(A(some_params))

    def forward(self, x):
        outs = []
        for i in range(100):
            out = self.a[i](x[i])
            outs.append(out)
        outs_concat = outs[0]
        for i in range(1, 100):
            outs_concat = torch.cat((outs_concat, outs[i]), 1)
        return outs_concat

Moreover, would someone please tell me how I can be sure about the parameters? Do the parameters of module A act as parameters of module B? I want to update all parameters when I call the optimizer, and I do not know how to verify that all of them are being updated.
I am looking forward to seeing your solutions.
Regards,
Sina

This won’t work, as your A modules won’t be registered when they are stored in a plain Python list, so you would have to use an nn.ModuleList instead.
This will make sure the passed modules, as well as all of their parameters, are registered in the parent module.
Calling B.parameters() will thus yield all parameters.
You could double check it via print(dict(modelB.named_parameters())).
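
For reference, a minimal sketch of what B could look like with nn.ModuleList (the nn.Linear inside A is only an assumed placeholder for illustration):

import torch
import torch.nn as nn

class A(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        # assumed placeholder layer, purely for illustration
        self.fc = nn.Linear(in_features, out_features)

    def forward(self, x):
        return self.fc(x)

class B(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        # nn.ModuleList registers every submodule and all of its parameters
        self.a = nn.ModuleList([A(in_features, out_features) for _ in range(100)])

    def forward(self, x):
        # here x is assumed to have shape (100, batch_size, in_features),
        # so x[i] is the input for the i-th submodule
        outs = [self.a[i](x[i]) for i in range(100)]
        return torch.cat(outs, 1)

modelB = B(10, 4)
print(len(dict(modelB.named_parameters())))  # 200 entries: a weight and a bias per submodule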

Thanks.
Would you please tell me how I can iterate over the ModuleList in the forward method?
Thanks

You can iterate it like a plain Python list. It’s a drop-in replacement, which makes sure to register all parameters.
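
For example, something like this sketch (assuming self.a is the nn.ModuleList and x is indexable per submodule, as in your snippet above):

def forward(self, x):
    outs = []
    # iterate over the ModuleList just like a plain Python list
    for i, module in enumerate(self.a):
        outs.append(module(x[i]))
    return torch.cat(outs, 1)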

Whenever I use the following code in the “forward” method of class B, it does not work when the input contains multiple samples:

for i in range(10):
    out = self.a[i](X[i])

I mean that if the input array has shape (batch_size, 5, 10) with batch_size = 3,
then, when I feed the input array into an instance of class B, it raises an index-out-of-bounds error.

b = B()
y_hat = b(input_array[index : index + batch_size])

Would you please help me understand how to work with batches in this case?

I assume you would like to feed the batch to the first layer and then pipe the output through all others?
If so, then this code snippet should work:

out = self.a[0](X)
for i in range(1, 10):
    out = self.a[i](out)

Actually, each instance (sample) of X has several features, each feature must be fed to its corresponding self.a[i] separately, and then I would like to concatenate all of their outputs.

What’s the shape of X?
If it’s [batch_size, nb_features=10], then this should work:

out = []
for i in range(10):
    out.append(self.a[i](X[:, i]))
out = torch.stack(out)

In this example, the input has shape (batch_size, 5, 10).
That means I have batch_size samples, 5 features, and each feature is a vector of size 10.
Now, I want to feed each feature i to its corresponding self.a[i].
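
For that layout, a sketch along the lines of the snippet above could slice dimension 1 and concatenate the outputs (assuming self.a holds 5 submodules, each of which accepts a 10-dimensional input):

outs = []
for i in range(5):
    # X[:, i] has shape (batch_size, 10): the i-th feature vector of every sample
    outs.append(self.a[i](X[:, i]))
# concatenate along dim 1, e.g. (batch_size, 5 * out_features) if each submodule returns (batch_size, out_features)
out = torch.cat(outs, 1)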