What would be an easy way of retrieving the LSTM output for all layers

What would be an easy way of retrieving the LSTM output for all layers and all time steps, that is, not only for the last layer, as is done by default?

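You’ll have to build your own network out of calls to nn.LSTMCell. The cuDNN kernel that serves as the backend for nn.LSTM doesn’t return the intermediate activations you’re looking for: nn.LSTM gives you the last layer’s output at every step, plus the final hidden and cell states of each layer, but not the per-step outputs of the lower layers.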
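A minimal sketch of that approach, assuming a unidirectional LSTM without dropout and a time-major input of shape (seq_len, batch, input_size). The class name `StackedLSTM` and its arguments are made up for illustration and are not part of PyTorch; only `nn.LSTMCell` and `nn.ModuleList` are library calls.

```python
import torch
import torch.nn as nn


class StackedLSTM(nn.Module):
    """Hypothetical multi-layer LSTM built from nn.LSTMCell.

    Unlike nn.LSTM, it returns the hidden state of every layer at every step.
    """

    def __init__(self, input_size, hidden_size, num_layers):
        super().__init__()
        self.hidden_size = hidden_size
        # The first cell consumes the input; later cells consume the layer below.
        self.cells = nn.ModuleList([
            nn.LSTMCell(input_size if i == 0 else hidden_size, hidden_size)
            for i in range(num_layers)
        ])

    def forward(self, x):
        # x: (seq_len, batch, input_size)
        seq_len, batch, _ = x.shape
        # Zero initial hidden and cell states, one pair per layer.
        h = [x.new_zeros(batch, self.hidden_size) for _ in self.cells]
        c = [x.new_zeros(batch, self.hidden_size) for _ in self.cells]

        outputs = []
        for t in range(seq_len):
            layer_input = x[t]
            step_outputs = []
            for i, cell in enumerate(self.cells):
                h[i], c[i] = cell(layer_input, (h[i], c[i]))
                layer_input = h[i]  # feed this layer's output to the next layer
                step_outputs.append(h[i])
            outputs.append(torch.stack(step_outputs))  # (num_layers, batch, hidden)

        # (seq_len, num_layers, batch, hidden_size)
        return torch.stack(outputs)


model = StackedLSTM(input_size=8, hidden_size=16, num_layers=3)
all_h = model(torch.randn(5, 2, 8))
print(all_h.shape)  # torch.Size([5, 3, 2, 16])
```

Here `all_h[t, l]` is the output of layer `l` at step `t`, so `all_h[:, -1]` corresponds to what nn.LSTM would return as its `output`. Expect this to be slower than nn.LSTM, since the Python loop over steps cannot use the fused cuDNN kernel.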

For a worked example, you could look at the OpenNMT model in the PyTorch examples repository: https://github.com/pytorch/examples/blob/master/OpenNMT/onmt/Models.py#L44