Training speed is slow in seq2seq decoder due to for-loop

I am training a batched seq2seq model using the code from https://github.com/spro/practical-pytorch/blob/master/seq2seq-translation/seq2seq-translation-batched.ipynb.

I found that the decoder's forward and backward passes are quite slow due to the following for-loop:

decoder_input = [SOS_token for _ in range(batch_size)]

loss = 0
for i in range(max_target_length):
    decoder_output = train_decoder(decoder_input)
    decoder_input = target[i]  # teacher forcing: feed the ground-truth token next
    loss += loss_function(decoder_output, target[i])
loss.backward()

How can I speed this up, or avoid the for-loop?

I believe the issue is actually the for-loop in the attention class, specifically these lines:

    # Create variable to store attention energies
    attn_energies = Variable(torch.zeros(this_batch_size, max_len)) # B x S

    if USE_CUDA:
        attn_energies = attn_energies.cuda()

    # For each batch of encoder outputs
    for b in range(this_batch_size):
        # Calculate energy for each encoder output
        for i in range(max_len):
            attn_energies[b, i] = self.score(hidden[:, b], encoder_outputs[i, b].unsqueeze(0))

It is faster to do this calculation using batched torch functions, something like attn_weights = torch.bmm(encoder_outputs, decoder_output), though you'll have to reshape encoder_outputs and decoder_output first.
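
For the 'dot' scoring method, the whole double loop collapses into a single batched matrix multiply. A rough sketch, assuming (as in the snippet above) that hidden is 1 x B x H and encoder_outputs is S x B x H; the function name is made up:

import torch
import torch.nn.functional as F

def batched_dot_attention(hidden, encoder_outputs):
    # encoder_outputs: (S, B, H) -> (B, S, H)
    enc = encoder_outputs.transpose(0, 1)
    # hidden: (1, B, H) -> (B, H, 1)
    query = hidden.permute(1, 2, 0)
    # Batched dot products: (B, S, H) x (B, H, 1) -> (B, S, 1) -> (B, S)
    attn_energies = torch.bmm(enc, query).squeeze(2)
    # Normalize over the source length and unsqueeze to (B, 1, S),
    # ready to be used as attn_weights.bmm(encoder_outputs.transpose(0, 1))
    return F.softmax(attn_energies, dim=1).unsqueeze(1)

The 'general' score can be vectorized the same way by first passing encoder_outputs through the attention module's linear layer before the bmm.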

Hi, I have the same problem as you. Did you find a way to speed this up? Thanks.

Using the for loop in the decoder will decrease the speed:

for i in range(max_target_length):

Just refer to the decoder here.
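
If you always train with teacher forcing (and the decoder has no per-step attention), one way to drop that Python loop entirely is to run the GRU over the whole shifted target sequence in a single call. A rough sketch with hypothetical names, not the notebook's actual decoder:

import torch
import torch.nn as nn

class LoopFreeDecoder(nn.Module):
    # Only valid with full teacher forcing and without per-step attention.
    def __init__(self, hidden_size, output_size):
        super(LoopFreeDecoder, self).__init__()
        self.embedding = nn.Embedding(output_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size)
        self.out = nn.Linear(hidden_size, output_size)

    def forward(self, target_seq, encoder_hidden):
        # target_seq: (T, B) token ids, shifted right so the first row is SOS
        embedded = self.embedding(target_seq)                  # (T, B, H)
        outputs, hidden = self.gru(embedded, encoder_hidden)   # one call, no Python loop
        return self.out(outputs), hidden                       # (T, B, vocab), (1, B, H)

The loss can then be computed over all time steps at once, e.g. by flattening the (T, B, vocab) outputs and the (T, B) targets before calling the criterion.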