RuntimeError: Expected hidden size (2, 9, 100), got (2, 20, 100)

I have been using PyTorch for CNNs and now need to use an RNN, but after a few days of effort I still cannot run it successfully. The network looks like this:

import torch
import torch.nn as nn
from torch.autograd import Variable

class Net(nn.Module):
    def __init__(self, input_dim, nb_lstm_units, layer_dim, output_dim, batch_size=20):
        super(Net, self).__init__()
        self.nb_lstm_units = nb_lstm_units
        self.layer_dim = layer_dim
        self.batch_size = batch_size

        self.rnn = nn.RNN(input_dim, nb_lstm_units, layer_dim, batch_first=True, nonlinearity='relu')
        self.fc_out = nn.Linear(nb_lstm_units, output_dim)

    def init_hidden(self):
        # hidden state shape: (num_layers, batch, hidden_size)
        hidden_a = torch.randn(self.layer_dim, self.batch_size, self.nb_lstm_units)
        hidden_a = hidden_a.cuda()
        return Variable(hidden_a)

    def forward(self, X):
        # (batch, n_features, timestamp) -> (batch, timestamp, n_features)
        X = X.transpose(1, 2)
        batch_size, seq_len, _ = X.size()

        self.hidden = self.init_hidden()
        out, self.hidden = self.rnn(X, self.hidden)

        # classify from the last time step only
        out = self.fc_out(out[:, -1, :])

        return out


# Create RNN
input_dim = 1
nb_lstm_units = 100
layer_dim = 2
output_dim = 2
    
model = Net(input_dim, nb_lstm_units, layer_dim, output_dim)

The input is a 1D ECG signal segmented into windows of 217 samples, with batch size 20. forward runs and I can print the output of self.fc_out, but the error appears afterward. Does the hidden output of the RNN need extra processing?
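A minimal reproduction of the mismatch (my sketch, not part of the original post): the numbers in the error suggest the dataset size is not divisible by 20, so the final batch holds only 9 segments while init_hidden always allocates a hidden state for 20.

model = Net(input_dim=1, nb_lstm_units=100, layer_dim=2, output_dim=2).cuda()

X_full = torch.randn(20, 1, 217).cuda()  # a regular batch of 20 segments
print(model(X_full).shape)               # works: torch.Size([20, 2])

X_last = torch.randn(9, 1, 217).cuda()   # a hypothetical final, smaller batch
model(X_last)  # RuntimeError: Expected hidden size (2, 9, 100), got (2, 20, 100)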

Why do you call init_hidden in forward? If you want the hidden states to be parameters of your model, you should initialize them in the constructor.

Since each data segment is passed to forward one at a time, I don't want the RNN to link the current data segment to previous segments. Please correct me if I am wrong. Several RNN tutorials I found (like this one) do it in a similar way.

In the tutorial you linked, there are two differences:

  • They initialize with zeros, you initialize with random values
  • They don't store it on the module, but you do (self.hidden)

Changing the initialisation and the rnn call as suggested produces the same error:

h0 = self.init_hidden()
out, h1 = self.rnn(X, h0)
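For what it's worth, the error message itself points at the cause: "expected hidden size (2, 9, 100)" means the incoming batch contains only 9 segments, while init_hidden always builds a hidden state for self.batch_size = 20. A sketch of one possible fix (my suggestion, assuming the last batch of each epoch is smaller than 20): derive the hidden state's batch dimension from the input rather than from the constructor argument.

def init_hidden(self, batch_size):
    # Size the hidden state for the actual batch, which may be smaller
    # than 20 for the last batch of an epoch.
    hidden_a = torch.zeros(self.layer_dim, batch_size, self.nb_lstm_units)
    return Variable(hidden_a.cuda())

def forward(self, X):
    X = X.transpose(1, 2)              # (batch, timestamp, n_features)
    h0 = self.init_hidden(X.size(0))   # X.size(0) is the runtime batch size
    out, h1 = self.rnn(X, h0)
    return self.fc_out(out[:, -1, :])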

Is this a known open issue, or am I missing something? I had to move to Keras to do the same thing and it works fine, but PyTorch gives me much more freedom to play with my code.

Any help is appreciated.

In forward(), passing None for the hidden state made the error message disappear.

self.rnn(X, None)
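This works because, when no hidden state is supplied, nn.RNN creates a zero-filled hidden state whose batch dimension is taken from the input, so the sizes can never mismatch. A sketch of the explicit equivalent inside forward (same behaviour, spelled out):

# Equivalent to passing None: zeros sized from the actual input batch.
h0 = torch.zeros(self.layer_dim, X.size(0), self.nb_lstm_units, device=X.device)
out, hn = self.rnn(X, h0)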

Now the training loss decreases nicely but then suddenly jumps to a huge positive value. From other examples I can see that this is not the ideal way. Any suggestion for handling the hidden state and the initial weights would be much appreciated.
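One common cause of a suddenly exploding loss (an assumption on my part, not confirmed in this thread) is exploding gradients, to which RNNs with nonlinearity='relu' are particularly prone. A standard mitigation is clipping the gradient norm in the training loop; here optimizer, criterion, X, and y are hypothetical stand-ins for your own training objects, and max_norm is a value to tune.

# Hypothetical training step with gradient-norm clipping.
optimizer.zero_grad()
loss = criterion(model(X), y)
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()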