Stateless RNN vs Stateful RNN

Does it make sense that a stateless RNN had better performance than a stateful RNN?

```
import numpy as np
import torch

hidden = None
y_pred = []
for x_i in x.tolist():

    # shape the scalar input as (seq_len=1, input_size=1)
    x_i = np.array([x_i])[:, np.newaxis]

    hidden = None  # commented out in the stateful case

    # add a batch dimension: (batch=1, seq_len=1, input_size=1)
    x_tensor = torch.Tensor(x_i).unsqueeze(0)
    prediction, hidden = rnn(x_tensor, hidden)
    hidden = hidden.data  # detach the hidden state from the computation graph
    prediction = prediction.detach().numpy().flatten()
    y_pred.append(prediction)
```
![34|374x500](upload://naWxCncyBFsSp6yUcCeb5NZ7JAK.png)

Could you explain what exactly you mean by a stateless RNN and what network topology you are using? Is `rnn` in your case a cell or a complete RNN? Do you pass a whole sequence or only a single timestep as input?

If the initial hidden state is not passed (i.e. it is None), a zero vector is used internally as the first hidden state. If conditioning on the carried-over hidden state is not beneficial, it is possible that the 'performance' of the model is better than when using that additional context vector.
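As a quick check (a minimal sketch, assuming a single-layer `nn.RNN` with made-up sizes), passing `None` gives the same output as passing an explicit zero tensor:

```
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=1, hidden_size=8, num_layers=1, batch_first=True)
x = torch.randn(1, 5, 1)                    # (batch, seq_len, input_size)

out_none, _ = rnn(x, None)                  # no initial hidden state passed
h0 = torch.zeros(1, 1, 8)                   # (num_layers, batch, hidden_size)
out_zero, _ = rnn(x, h0)                    # explicit zero initial state

print(torch.allclose(out_none, out_zero))   # True
```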

I suppose it’s a complete RNN.

By stateless I mean that during evaluation (prediction mode) I pass hidden = None at every iteration instead of carrying over the hidden state returned by the previous step, as sketched below.
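For comparison, a minimal sketch of the stateful variant (the same loop as in the first post, only the hidden state is carried across timesteps; `rnn` and `x` are assumed to be the model and input series from above):

```
hidden = None          # start from the zero state only once
y_pred_stateful = []
for x_i in x.tolist():
    x_tensor = torch.Tensor([[x_i]]).unsqueeze(0)  # (batch=1, seq_len=1, input_size=1)
    prediction, hidden = rnn(x_tensor, hidden)
    hidden = hidden.data                           # detach so the graph is not kept across steps
    y_pred_stateful.append(prediction.detach().numpy().flatten())
```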

Code for RNN class:

```
import torch
import torch.nn as nn

class RNN(nn.Module):
    def __init__(self, input_size, output_size, hidden_dim, n_layers):
        super(RNN, self).__init__()

        self.hidden_dim = hidden_dim

        # define an RNN with the specified parameters
        self.rnn = nn.RNN(input_size, hidden_dim, n_layers, batch_first=True)

        # last, fully-connected layer
        self.fc = nn.Linear(hidden_dim, output_size)

    def forward(self, x, hidden):
        # x has shape (batch_size, seq_len, input_size)
        batch_size = x.size(0)

        r_out, hidden = self.rnn(x, hidden)
        # flatten the RNN outputs to (batch_size * seq_len, hidden_dim) for the linear layer
        r_out = r_out.view(-1, self.hidden_dim)
        output = self.fc(r_out)

        return output, hidden
```
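A minimal usage sketch (the hyperparameters are made up for illustration; a univariate series and hidden_dim=32 are assumed):

```
rnn = RNN(input_size=1, output_size=1, hidden_dim=32, n_layers=1)

x_tensor = torch.randn(1, 20, 1)       # (batch=1, seq_len=20, input_size=1)
output, hidden = rnn(x_tensor, None)   # None -> zero initial hidden state

print(output.shape)   # torch.Size([20, 1]) - one prediction per timestep
print(hidden.shape)   # torch.Size([1, 1, 32]) - (n_layers, batch, hidden_dim)
```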

I ran the experiments again and could not reproduce it.

As expected, the stateful RNN gave better results.