Passing generated output as input to t+1 without a for loop

Is it possible to make, say, LSTMs in PyTorch pass the generated output as input to the next time step without using a for loop?
For example, if decoder_input_emb in the following example is a 3D tensor (batch, seq_len, dim), the LSTM will always take the input from decoder_input_emb, not from the generated output.

lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
outputs, hidden = lstm(decoder_input_emb, hidden)

Is there a way to force the LSTM to take its own prediction as the next input?

I know that it can be done in a loop like this (simplified):

decoder_input = decoder_input_emb[:, 0:1]  # lstm is a pre-built nn.LSTM module
for di in range(max_length):
    step_output, hidden = lstm(decoder_input, hidden)
    decoder_input = step_output

But this is significantly slower.
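As a concrete illustration of the loop approach, here is a minimal runnable sketch using nn.LSTMCell, which is designed for exactly this kind of stepwise decoding. It assumes (for simplicity) that the hidden size equals the input size so the raw hidden state can be fed straight back in; in a real decoder you would typically insert a projection/embedding step between output and next input. All sizes are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch, max_length, dim = 4, 10, 32  # illustrative sizes

# Build ONE cell and reuse it every step (do not construct nn.LSTM in the loop)
cell = nn.LSTMCell(input_size=dim, hidden_size=dim)

decoder_input_emb = torch.randn(batch, max_length, dim)
h = torch.zeros(batch, dim)
c = torch.zeros(batch, dim)

decoder_input = decoder_input_emb[:, 0]  # seed with the first time step
outputs = []
for di in range(max_length):
    h, c = cell(decoder_input, (h, c))
    decoder_input = h  # feed the prediction back as the next input
    outputs.append(h)

outputs = torch.stack(outputs, dim=1)  # (batch, max_length, dim)
```

The loop is unavoidable here because each step's input depends on the previous step's output, so the time dimension cannot be parallelized the way a single nn.LSTM call over a full tensor can.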

nn.LSTM does that for you. You don't need to run it in a loop.

Sorry Simon, would you mind explaining this? I thought nn.LSTM and nn.RNN used the input tensor at the next sequence index as input, and only passed the hidden state forward.
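That understanding can be checked directly: a single nn.LSTM call over a (batch, seq_len, dim) tensor reads its input from that tensor at every step (carrying only the hidden state forward), and is equivalent to manually unrolling it one step at a time while feeding x[:, t], not the previous output. A small sketch with illustrative sizes:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch, seq_len, dim = 2, 5, 8  # illustrative sizes
lstm = nn.LSTM(input_size=dim, hidden_size=dim, batch_first=True)
x = torch.randn(batch, seq_len, dim)

# One call over the full sequence: the input at every step comes from x
full_out, _ = lstm(x)

# Manual unroll: feed x[:, t] (NOT the previous output) and carry hidden state
h = None
steps = []
for t in range(seq_len):
    out, h = lstm(x[:, t:t + 1], h)
    steps.append(out)
manual_out = torch.cat(steps, dim=1)

print(torch.allclose(full_out, manual_out, atol=1e-6))
```

So nn.LSTM alone does not feed its own predictions back in; that autoregressive behavior still requires an explicit loop (or a decoding utility built on one).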