Hidden units saturate in a seq2seq model in PyTorch

Perhaps try this: