How does torch ModuleList work?

I am trying to understand how torch.nn.ModuleList works. I slightly modified the example from the official documentation here -

Taken from ModuleList — PyTorch 1.13 documentation

import torch.nn as nn

class MyModule(nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        self.linears = nn.ModuleList([nn.Linear(10, 10) for i in range(10)])

    def forward(self, x):
        # ModuleList can act as an iterable, or be indexed using ints
        for i, l in enumerate(self.linears):
            # Why would PyTorch give such a hard example?
            # x = self.linears[i // 2](x) + l(x)
            x = self.linears(x)
        return x

If I understand correctly, the tensor x is sent through 10 hidden layers as per the definition of self.linears. What is the for loop in forward doing?

Perhaps I am understanding the whole thing wrong?

Hi,
Your modification won't work as you probably expect (that is, passing the tensor x through each layer), because ModuleList does not implement a forward method, so self.linears(x) will raise an error.

You might want to use x = l(x) to simplify the example given in the docs.

As for the example given in the docs, I think it is written that way to demonstrate both ways of using the layers in a ModuleList: indexing it with integers, just as a plain Python list can be indexed, and iterating/enumerating over the layers.
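As a minimal standalone illustration (the names here are made up, not from the docs), the two access patterns look like this:

import torch
import torch.nn as nn

layers = nn.ModuleList([nn.Linear(10, 10) for _ in range(3)])
x = torch.randn(5, 10)

# Indexing with an int, just like a plain Python list
y = layers[0](x)

# Iterating / enumerating over the layers
for i, layer in enumerate(layers):
    x = layer(x)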

Thank you for the correction. I have made the changes here -

# Taken from https://pytorch.org/docs/stable/generated/torch.nn.ModuleList.html
import torch
import torch.nn as nn

class MyModule(nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        self.linears = nn.ModuleList([nn.Linear(10, 10) for i in range(10)])

    def forward(self, x):
        # ModuleList can act as an iterable, or be indexed using ints
        for i, l in enumerate(self.linears):
            print("i = ", i)
            print("l = ", l)
            # Why would PyTorch give such a hard example?
            # x = self.linears[i // 2](x) + l(x)
            x = l(x)
        return x

x = torch.randn(5, 10)
print("x = ", x.shape)
model = MyModule()
y = model(x)

I have a few follow-up questions to clarify my understanding -

  1. Does self.linears = nn.ModuleList([nn.Linear(10, 10) for i in range(10)]) create a 10-layer neural network?
  2. Does the following code iterate over this 10-layer neural network 10 times -

     for i, l in enumerate(self.linears):
         # Why would PyTorch give such a hard example?
         # x = self.linears[i // 2](x) + l(x)
         x = l(x)
Actually, my entire purpose in asking this is to understand the code given here. The author writes the original Transformer architecture from scratch, and I am unable to understand the following lines -

self.layers = nn.ModuleList(
    [
        TransformerBlock(
            embed_size,
            heads,
            dropout=dropout,
            forward_expansion=forward_expansion,
        )
        for _ in range(num_layers)
    ]
)

Hi, sure.

1. Yes, you can use it that way.

2. No, only once. That code passes the tensor x through the layers one after another, so the output of the last layer is the final prediction. In effect, the loop is using the ModuleList as a 10-layer NN.
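To make that concrete, here is a small standalone sketch (illustrative only) of what the loop does; it chains the layers exactly the way nn.Sequential would:

import torch
import torch.nn as nn

linears = nn.ModuleList([nn.Linear(10, 10) for _ in range(10)])
x = torch.randn(5, 10)

# The for loop feeds x through the layers in order, in a single pass:
# x -> linears[0] -> linears[1] -> ... -> linears[9]
out = x
for layer in linears:
    out = layer(out)

# Same chaining as nn.Sequential built from the very same layers
seq = nn.Sequential(*linears)
print(torch.allclose(out, seq(x)))  # True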

In the transformer code, num_layers TransformerBlock instances are being stored in a ModuleList.

A ModuleList is very similar to a plain Python list and is meant to store nn.Module objects, just as a plain Python list is used to store ints, floats, etc. The purpose of ModuleList is to ensure that the parameters of the layers it holds are registered properly with the parent module, so they show up in model.parameters() and are trained.
The layers it contains aren't connected to each other in any way; it is up to your forward method to decide how they are called.
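A small sketch of the registration point (the class names here are made up), storing the same layers in a plain Python list vs. a ModuleList:

import torch.nn as nn

class WithPlainList(nn.Module):
    def __init__(self):
        super().__init__()
        # Plain Python list: the Linear layers are NOT registered as submodules
        self.layers = [nn.Linear(10, 10) for _ in range(3)]

class WithModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        # ModuleList: the same layers are registered with the parent module
        self.layers = nn.ModuleList([nn.Linear(10, 10) for _ in range(3)])

print(len(list(WithPlainList().parameters())))   # 0 -> an optimizer would never see them
print(len(list(WithModuleList().parameters())))  # 6 (weight + bias for each of the 3 layers)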
