What's the difference between nn.ModuleList() and a Python list?

ljh · December 17, 2020, 8:45am

Hi! I was constructing a multilayer LSTM by stacking several LSTM layers in a list.
I found that when I use a plain Python list, the loss stays high and barely decreases.
But when I use nn.ModuleList(), training behaves normally.
So why can’t I use a Python list? Do I have to use nn.ModuleList()?

Here’s the code and loss when using nn.ModuleList():

class MultiLayerLSTM(nn.Module):
    def __init__(self, input_size, hidden_size,num_layers,batch_first):
        super(MultiLayerLSTM, self).__init__()
        self.num_layers = num_layers
        self.LSTMs = nn.ModuleList()
        self.LSTMs.append(NaiveLSTM(input_size, hidden_size, batch_first=True))
        for i in range(num_layers-1):
            self.LSTMs.append(NaiveLSTM(hidden_size, hidden_size, batch_first=True))

    def forward(self,x,state):
        (h0s,c0s) = state
        for i in range(self.num_layers):
            x,_ = self.LSTMs[i](x,(h0s[i].unsqueeze(0),c0s[i].unsqueeze(0)))
        return x,_

Epoch [1/2], Step [100/600], Loss: 0.6575
Epoch [1/2], Step [200/600], Loss: 0.4976
Epoch [1/2], Step [300/600], Loss: 0.1577
Epoch [1/2], Step [400/600], Loss: 0.2070
Epoch [1/2], Step [500/600], Loss: 0.0443
Epoch [1/2], Step [600/600], Loss: 0.1664
Epoch [2/2], Step [100/600], Loss: 0.0811
Epoch [2/2], Step [200/600], Loss: 0.1011
Epoch [2/2], Step [300/600], Loss: 0.1325
Epoch [2/2], Step [400/600], Loss: 0.1050
Epoch [2/2], Step [500/600], Loss: 0.1075
Epoch [2/2], Step [600/600], Loss: 0.0404

Here’s the code and loss when using a Python list:

class MultiLayerLSTM(nn.Module):
    def __init__(self, input_size, hidden_size,num_layers,batch_first):
        super(MultiLayerLSTM, self).__init__()
        self.num_layers = num_layers
        self.LSTMs = []
        self.LSTMs.append(NaiveLSTM(input_size, hidden_size, batch_first=True))
        for i in range(num_layers-1):
            self.LSTMs.append(NaiveLSTM(hidden_size, hidden_size, batch_first=True))

    def forward(self,x,state):
        (h0s,c0s) = state
        for i in range(self.num_layers):
            x,_ = self.LSTMs[i](x,(h0s[i].unsqueeze(0),c0s[i].unsqueeze(0)))
        return x,_

Epoch [1/2], Step [100/600], Loss: 2.2161
Epoch [1/2], Step [200/600], Loss: 2.1421
Epoch [1/2], Step [300/600], Loss: 2.1189
Epoch [1/2], Step [400/600], Loss: 2.0571
Epoch [1/2], Step [500/600], Loss: 2.1516
Epoch [1/2], Step [600/600], Loss: 2.0326
Epoch [2/2], Step [100/600], Loss: 1.9656
Epoch [2/2], Step [200/600], Loss: 1.9204
Epoch [2/2], Step [300/600], Loss: 1.9620
Epoch [2/2], Step [400/600], Loss: 1.8612
Epoch [2/2], Step [500/600], Loss: 1.8466
Epoch [2/2], Step [600/600], Loss: 1.8985

ptrblck · December 17, 2020, 8:50am

Plain Python lists won’t register the modules properly, so e.g. model.parameters() will not return the parameters of the submodules in the list (and thus your optimizer won’t get them, so they are never updated during training).
You should use nn.ModuleList instead, as described in the docs.
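
To see the registration difference directly, here is a minimal sketch. It uses nn.Linear as a stand-in for NaiveLSTM (which isn't defined in the post), and the class names WithPythonList and WithModuleList plus the count_parameters helper are made up for illustration. It only compares what model.parameters() returns in each case:

```python
import torch
from torch import nn


class WithPythonList(nn.Module):
    def __init__(self, num_layers=3, hidden_size=4):
        super().__init__()
        # A plain list is stored as an ordinary attribute. nn.Module does not
        # look inside it, so these layers are never registered as submodules.
        self.layers = [nn.Linear(hidden_size, hidden_size) for _ in range(num_layers)]


class WithModuleList(nn.Module):
    def __init__(self, num_layers=3, hidden_size=4):
        super().__init__()
        # nn.ModuleList registers every layer it holds as a submodule.
        self.layers = nn.ModuleList(
            [nn.Linear(hidden_size, hidden_size) for _ in range(num_layers)]
        )


def count_parameters(model):
    # Hypothetical helper: total number of scalar parameters that
    # model.parameters() exposes (this is what an optimizer would receive).
    return sum(p.numel() for p in model.parameters())


list_model = WithPythonList()
module_list_model = WithModuleList()

list_param_count = count_parameters(list_model)
module_list_param_count = count_parameters(module_list_model)

# Each Linear(4, 4) has 16 weights + 4 biases = 20 parameters, 3 layers in total.
print(list_param_count)         # 0
print(module_list_param_count)  # 60
```

With the plain list, the layers still run in forward(), but their weights stay at their random initial values because the optimizer never sees them. That would be consistent with the loss in the second log staying close to where it started. The same registration is also what model.to(device) and model.state_dict() rely on, so unregistered layers are skipped there as well.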