Differences between nn.RNN and nn.RNNCell

I am consistently seeing strange behavior at the beginning of the predicted sequence when using torch.nn.RNN. This phenomenon is not present when using torch.nn.RNNCell.

The code to reproduce this behavior is in the notebook Signal2Signal.ipynb in the landajuela/PyTorch-examples repository on GitHub (master branch). Set cell_model = True or cell_model = False in the training cell to switch between the two versions.

It seems to be an initialization problem, but I have not been able to correct it.
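
One way to narrow this down might be to check that the two modules agree when they share the same weights and the same initial hidden state. The sketch below is not taken from the notebook: the sizes, the random input, and the zero initial state are all assumptions chosen for illustration. It copies the parameters of a single-layer nn.RNN into an nn.RNNCell, unrolls the cell by hand, and compares the outputs. If the two match here, the difference in the notebook more likely comes from how the hidden state is initialized or carried between steps than from the modules themselves.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes only (assumptions, not the notebook's values)
input_size, hidden_size, seq_len, batch = 1, 8, 20, 4

rnn = nn.RNN(input_size, hidden_size, batch_first=True)  # single layer, tanh
cell = nn.RNNCell(input_size, hidden_size)               # tanh by default

# Copy the RNN's layer-0 parameters into the cell so both share weights
with torch.no_grad():
    cell.weight_ih.copy_(rnn.weight_ih_l0)
    cell.weight_hh.copy_(rnn.weight_hh_l0)
    cell.bias_ih.copy_(rnn.bias_ih_l0)
    cell.bias_hh.copy_(rnn.bias_hh_l0)

x = torch.randn(batch, seq_len, input_size)
h0 = torch.zeros(1, batch, hidden_size)  # shape (num_layers, batch, hidden)

# Full-sequence module: processes all time steps in one call
out_rnn, h_n = rnn(x, h0)

# Cell: unroll manually, starting from the same initial hidden state
h = h0[0]
outs = []
for t in range(seq_len):
    h = cell(x[:, t, :], h)
    outs.append(h)
out_cell = torch.stack(outs, dim=1)  # shape (batch, seq_len, hidden)

max_diff = (out_rnn - out_cell).abs().max().item()
print("max abs difference:", max_diff)
```

Running the same comparison with the notebook's own model and initial hidden state, in place of the assumed values above, would show whether the two code paths start from the same state at the first time step.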