Michael_D
(Michael D)
September 23, 2019, 9:49am
1
Does it make sense that a stateless RNN performed better than a stateful RNN?
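```python
import numpy as np
import torch

# `rnn` (the trained model) and `x` (the input series) are defined elsewhere.
hidden = None
y_pred = []
for x_i in x.tolist():
    x_i = np.array([x_i])[:, np.newaxis]  # add a trailing feature axis
    hidden = None  # Stateless case; comment this line out for the stateful case.
    x_tensor = torch.Tensor(x_i).unsqueeze(0)  # add the batch dimension
    prediction, hidden = rnn(x_tensor, hidden)
    hidden = hidden.data  # keep the values, drop the autograd history
    prediction = prediction.detach().numpy().flatten()
    y_pred.append(prediction)
```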
![34|374x500](upload://naWxCncyBFsSp6yUcCeb5NZ7JAK.png)
Could you explain what exactly you mean by a stateless RNN, and which network topology you are using? Is rnn in your case a single cell or a complete RNN? Do you pass a whole sequence or only one timestep as input?
If no initial hidden state is passed (None), a zero vector is used internally as the first hidden state. If conditioning on the initial hidden state is not beneficial, it is possible that the model performs better without it than with an additional context vector.
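To make the zero-vector default concrete, here is a minimal sketch of a one-unit Elman cell in plain Python (no PyTorch required). It is a hand-written stand-in, not PyTorch's implementation; `toy_rnn`, `W_IH`, `W_HH`, `BIAS` and the weight values are made up for illustration. Passing `hidden=None` should give the same outputs as passing an explicit zero state.

```python
import math

# Hypothetical fixed weights for a one-unit Elman cell (illustrative values only).
W_IH, W_HH, BIAS = 0.5, 0.8, 0.1

def toy_rnn(seq, hidden=None):
    """Run the cell over a sequence of floats; return (outputs, last hidden)."""
    if hidden is None:
        hidden = 0.0  # mirrors the zero initial state used when None is passed
    outputs = []
    for x_t in seq:
        # Elman update: new state from current input and previous state
        hidden = math.tanh(W_IH * x_t + W_HH * hidden + BIAS)
        outputs.append(hidden)
    return outputs, hidden

seq = [0.2, -0.4, 0.9]
out_none, h_none = toy_rnn(seq, None)  # no initial state given
out_zero, h_zero = toy_rnn(seq, 0.0)   # explicit zero initial state
print(out_none == out_zero, h_none == h_zero)
```

So "stateless" here does not mean "no hidden state at all"; it means every call starts again from the same zero state.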
Michael_D
(Michael D)
September 23, 2019, 10:43am
3
I suppose it’s a complete RNN.
By stateless, I mean that in evaluation (prediction mode) I pass hidden = None at each iteration instead of carrying over the hidden state returned by the previous call.
Code for RNN class:
```python
import torch.nn as nn

class RNN(nn.Module):
    def __init__(self, input_size, output_size, hidden_dim, n_layers):
        super(RNN, self).__init__()
        self.hidden_dim = hidden_dim
        # define an RNN with specified parameters
        self.rnn = nn.RNN(input_size, hidden_dim, n_layers, batch_first=True)
        # last, fully-connected layer
        self.fc = nn.Linear(hidden_dim, output_size)

    def forward(self, x, hidden):
        batch_size = x.size(0)
        # r_out: (batch, seq_len, hidden_dim); hidden: (n_layers, batch, hidden_dim)
        r_out, hidden = self.rnn(x, hidden)
        # flatten batch and time so the linear layer sees one row per timestep
        r_out = r_out.view(-1, self.hidden_dim)
        output = self.fc(r_out)
        return output, hidden
```
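For anyone comparing the two evaluation modes, here is a rough sketch of the difference using a toy one-unit cell in plain Python. It is independent of the RNN class above; `step`, `predict` and the weight values are hypothetical and only there for illustration. In the stateless loop each prediction depends on the current input alone, while in the stateful loop it also depends on everything seen before.

```python
import math

# Hypothetical fixed weights for a one-unit Elman cell (illustrative values only).
W_IH, W_HH, BIAS = 0.5, 0.8, 0.1

def step(x_t, hidden=None):
    """One recurrent update; a missing state is treated as zero."""
    if hidden is None:
        hidden = 0.0
    return math.tanh(W_IH * x_t + W_HH * hidden + BIAS)

def predict(xs, stateful):
    """Feed inputs one timestep at a time, as in the evaluation loop."""
    hidden = None
    preds = []
    for x_t in xs:
        if not stateful:
            hidden = None  # stateless: forget everything before this step
        hidden = step(x_t, hidden)
        preds.append(hidden)
    return preds

xs = [0.2, -0.4, 0.9]
stateless = predict(xs, stateful=False)
stateful = predict(xs, stateful=True)
print(stateless)
print(stateful)
```

The first prediction is identical in both modes because both start from a zero state; they diverge from the second step on. Which mode scores better then depends on whether the history actually helps on the data.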
Michael_D
(Michael D)
September 24, 2019, 12:53pm
4
I ran the experiments again and could not reproduce the result.
As expected, the stateful version gave better results.