nn.Parameter() not in model.named_parameters()

Hello!

I have an LSTM as follows (I will not copy everything for simplicity):

class LSTM(nn.Module):
    def __init__(self,
                 bert,
                 hidden_dim,
                 output_dim,
                 n_layers,
                 bidirectional,
                 dropout, hidden_init):

        super().__init__()

        self.rnn = nn.LSTM(embedding_dim,
                           hidden_dim,
                           num_layers=n_layers,
                           bidirectional=bidirectional,
                           batch_first=True,
                           dropout=dropout, ...)

        (..)
        self.initial_hidden = self.init_hidden(self.batch_size)

    def forward(self, text):
        (..)

        # initialize the hidden and cell state randomly
        hidden = self.init_hidden(self.batch_size)
        _, hidden = self.rnn(embedded, hidden)

        return output

    def init_hidden(self, batch_size):
        h_0 = nn.Parameter(torch.randn(self.n_layers * 2, batch_size, self.hidden_dim))
        c_0 = nn.Parameter(torch.randn(self.n_layers * 2, batch_size, self.hidden_dim))
        return (h_0, c_0)

However, if I check the model parameters:

for name, param in model.named_parameters(): 
    if param.requires_grad: 
       print(name)

it prints:

rnn.weight_ih_l0
rnn.weight_hh_l0
rnn.bias_ih_l0
rnn.bias_hh_l0
rnn.weight_ih_l0_reverse
rnn.weight_hh_l0_reverse
rnn.bias_ih_l0_reverse
rnn.bias_hh_l0_reverse
rnn.weight_ih_l1
rnn.weight_hh_l1
rnn.bias_ih_l1
rnn.bias_hh_l1
rnn.weight_ih_l1_reverse
rnn.weight_hh_l1_reverse
rnn.bias_ih_l1_reverse
rnn.bias_hh_l1_reverse
out.weight
out.bias

Why can't I see the parameters h_0 and c_0?

I’m not an expert, but it seems to me that this is because self.initial_hidden is a tuple, not a parameter, and nn.Module only registers nn.Parameter objects that are assigned directly as attributes. You can try breaking it down into two attributes: self.initial_h, self.initial_c = self.init_hidden(self.batch_size)
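To see the difference, here is a minimal standalone sketch (a toy module, not the original LSTM) comparing a parameter tuple with directly assigned attributes:

```python
import torch
import torch.nn as nn

class Demo(nn.Module):
    def __init__(self):
        super().__init__()
        # stored inside a plain tuple: NOT registered by nn.Module
        self.hidden_tuple = (nn.Parameter(torch.randn(2, 3)),
                             nn.Parameter(torch.randn(2, 3)))
        # assigned directly as attributes: registered automatically
        self.initial_h = nn.Parameter(torch.randn(2, 3))
        self.initial_c = nn.Parameter(torch.randn(2, 3))

model = Demo()
names = [name for name, _ in model.named_parameters()]
print(names)  # only 'initial_h' and 'initial_c' appear; the tuple is invisible
```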

Thank you!! They now show up in model.named_parameters().

However, my Colab notebook crashes when I train the LSTM. I thought there might be a device issue, but both self.initial_h and self.initial_c are on CUDA.

As @Jonathan_Harel mentioned, nn.Parameters stored inside plain Python data structures (tuples, lists, dicts) are not discovered when you look for parameters.
You can either unpack them into two different attributes or store them in an nn.ParameterList.
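A minimal sketch of the nn.ParameterList route (toy shapes, not the original model), which keeps both tensors in one container while still registering them:

```python
import torch
import torch.nn as nn

class DemoList(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ParameterList registers every nn.Parameter it holds,
        # so both entries appear in named_parameters()
        self.initial_hidden = nn.ParameterList([
            nn.Parameter(torch.randn(2, 3)),  # h_0
            nn.Parameter(torch.randn(2, 3)),  # c_0
        ])

model = DemoList()
names = [name for name, _ in model.named_parameters()]
print(names)  # ['initial_hidden.0', 'initial_hidden.1']
```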