Initial hidden state gives different results for packed and padded input

Hello everyone,

When I initialize the RNN's hidden state with zeros, or with any tensor whose entries are all equal, running it on packed input and on plain padded input gives the same results. With any other initial hidden state, the two runs give different results.

Does anyone know why this might be? Below is the smallest example I could find that reproduces the problem.

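import torch
import torch.nn as nn
import torch.nn.utils.rnn as rnn_utils

torch.manual_seed(123)
embeddings = nn.Embedding(num_embeddings=10, embedding_dim=5, padding_idx=0)
rnn = nn.RNN(input_size=5, hidden_size=3, batch_first=True)

# Initial hidden state, shape (num_layers, batch, hidden_size)
hidden = torch.randn(1, 2, 3)
x = torch.tensor([[1, 2, 3, 0],
                  [4, 2, 8, 4]])

# Sort the batch by length (descending) for packing;
# idx_unsort restores the original batch order afterwards
lengths = torch.tensor([3, 4])
lengths_sorted, indices = lengths.sort(0, descending=True)
_, idx_unsort = indices.sort(0)

inputs = x[indices]
inputs = embeddings(inputs)

# First run: packed input
packed_input = rnn_utils.pack_padded_sequence(inputs, lengths_sorted, batch_first=True)
packed_output, hn = rnn(packed_input, hidden)
# The second return value holds the sequence lengths
output, out_lengths = rnn_utils.pad_packed_sequence(packed_output, batch_first=True)
output = output[idx_unsort]
print(output)
print()

# Second run: padded input fed directly, without packing
inputs = embeddings(x)
output, hn = rnn(inputs, hidden)
print(output)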

Output of this code:

tensor([[[-0.2238,  0.4706,  0.7754],
         [ 0.5517,  0.8055,  0.9824],
         [ 0.7480,  0.0946,  0.8721],
         [ 0.0000,  0.0000,  0.0000]],

        [[-0.7347, -0.6623,  0.2718],
         [ 0.4692,  0.9355,  0.9294],
         [ 0.2216,  0.6047,  0.9834],
         [-0.6857, -0.7347,  0.4201]]], grad_fn=<IndexBackward>)

tensor([[[-0.3639,  0.4463,  0.7247],
         [ 0.5497,  0.8280,  0.9791],
         [ 0.7492,  0.0915,  0.8734],
         [-0.0648, -0.3333,  0.7004]],

        [[-0.6556, -0.6447,  0.3757],
         [ 0.4676,  0.9268,  0.9377],
         [ 0.2202,  0.6044,  0.9833],
         [-0.6857, -0.7345,  0.4194]]], grad_fn=<TransposeBackward0>)
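
When comparing the two tensors, the padded time step of the first sequence is best left out: pad_packed_sequence fills it with zeros, while the plain run keeps stepping over the padding embedding. Below is a minimal sketch of such a comparison. The helper name `max_abs_diff_valid` is hypothetical, the tensors are assumed to have shape (batch, time, hidden), and the numbers in the illustration are made up rather than taken from the output above.

```python
import torch

def max_abs_diff_valid(a, b, lengths):
    """Largest absolute difference between a and b over non-padded time steps.

    a, b: tensors of shape (batch, time, hidden)
    lengths: 1-D tensor with the valid length of each sequence
    """
    time_steps = a.size(1)
    # valid[i, t] is True where t < lengths[i]
    valid = torch.arange(time_steps).unsqueeze(0) < lengths.unsqueeze(1)
    # Zero out the differences at padded positions before taking the maximum
    diff = (a - b).abs().masked_fill(~valid.unsqueeze(-1), 0.0)
    return diff.max().item()

# Made-up illustration: the large gap sits in a padded position and is ignored
a = torch.tensor([[[1.0], [2.0], [0.0]], [[3.0], [4.0], [5.0]]])
b = torch.tensor([[[1.0], [2.5], [9.0]], [[3.0], [4.0], [5.0]]])
print(max_abs_diff_valid(a, b, torch.tensor([2, 3])))  # 0.5
```

Applied to the two outputs above with lengths [3, 4], this would report only the differences at real time steps.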

However, if I change the initial state to:

hidden = torch.ones(1,2,3)

The two runs give the same result (apart from the padded time step, which is zero in the packed output):

tensor([[[-0.2183, -0.0011,  0.9200],
         [ 0.5076,  0.8225,  0.9777],
         [ 0.7486,  0.1064,  0.8680],
         [ 0.0000,  0.0000,  0.0000]],

        [[-0.6523, -0.8560,  0.7398],
         [ 0.4327,  0.9141,  0.9387],
         [ 0.2181,  0.6129,  0.9826],
         [-0.6852, -0.7349,  0.4207]]], grad_fn=<IndexBackward>)

tensor([[[-0.2183, -0.0011,  0.9200],
         [ 0.5076,  0.8225,  0.9777],
         [ 0.7486,  0.1064,  0.8680],
         [-0.0628, -0.3343,  0.7021]],

        [[-0.6523, -0.8560,  0.7398],
         [ 0.4327,  0.9141,  0.9387],
         [ 0.2181,  0.6129,  0.9826],
         [-0.6852, -0.7349,  0.4207]]], grad_fn=<TransposeBackward0>)
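
One thing worth checking (an assumption, not something confirmed in this post): the inputs are reordered with `indices` before packing, but the initial hidden state is not. If the hidden state's rows are all equal, reordering them changes nothing, which would match the behaviour seen with zeros or ones. The sketch below reuses the setup from the example and reorders the batch dimension (dim 1) of `hidden` for the packed run, then compares the two outputs at valid time steps only. The names `out_packed`, `out_plain`, `valid` and `same` are introduced here for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.utils.rnn as rnn_utils

torch.manual_seed(123)
embeddings = nn.Embedding(num_embeddings=10, embedding_dim=5, padding_idx=0)
rnn = nn.RNN(input_size=5, hidden_size=3, batch_first=True)

hidden = torch.randn(1, 2, 3)  # (num_layers, batch, hidden_size)
x = torch.tensor([[1, 2, 3, 0],
                  [4, 2, 8, 4]])
lengths = torch.tensor([3, 4])
lengths_sorted, indices = lengths.sort(0, descending=True)
_, idx_unsort = indices.sort(0)

# Packed run: reorder the hidden state along its batch dimension as well,
# so each sequence keeps the initial state it has in the plain run
packed_input = rnn_utils.pack_padded_sequence(
    embeddings(x[indices]), lengths_sorted, batch_first=True)
packed_output, hn_packed = rnn(packed_input, hidden[:, indices])
out_packed, _ = rnn_utils.pad_packed_sequence(packed_output, batch_first=True)
out_packed = out_packed[idx_unsort]

# Plain run on the padded batch in its original order
out_plain, _ = rnn(embeddings(x), hidden)

# Compare only the non-padded time steps
valid = (torch.arange(x.size(1)).unsqueeze(0) < lengths.unsqueeze(1)).unsqueeze(-1)
same = torch.allclose(out_packed * valid, out_plain * valid, atol=1e-5)
print(same)
```

If this is indeed the cause, the final hidden state from the packed run would presumably need the same treatment in reverse, for example `hn_packed[:, idx_unsort]`, before it is compared with the plain run.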

Thank you in advance. I’ve been stuck on this for a week now.