When I wrap nn.DataParallel around models that contain LSTMs, I keep getting runtime errors when packing my padded sequences, because the list of sequence lengths that I pass in is not automatically split along the batch dimension.

```
autoencoder = nn.DataParallel(autoencoder, dim=0)
...
# Method of autoencoder called in forward
def encode(self, indices, lengths):
    embeddings = self.embedding(indices)
    # error happens during packing
    packed_embeddings = pack_padded_sequence(input=embeddings,
                                             lengths=lengths,
                                             batch_first=True)
    # Encode
    packed_output, state = self.encoder(packed_embeddings)
    hidden, cell = state
```

For example, for batch size 64, the word index tensors I pass in are split across the batch dimension (32 each), but the sequence lengths list that I pass into pack_padded_sequence is still of length 64. I am using batch dimension as the first dimension (dimension 0).

I’ve seen several posts related to this issue, but haven’t found a definitive solution for this. Do I have to make the sequence lengths list a tensor? Or do I have to make the batch dimension the second dimension (dimension 1)? Any suggestions?

Maybe you can wrap the lengths list in a LongTensor and call .cuda() on it, so that it gets split across the GPUs along with the other inputs. Then, in the forward() method, cast the LongTensor back to a list. This works for my problem, though it is a bit inefficient.

Hello, can you explain this in more detail? I am also stuck on the same problem with DataParallel for RNNs (using packing). Any help would be appreciated.

Assume (x, len) is the input to your packed RNN, where x is a (batch_size, max_len, d_embed) tensor and len is a list of length batch_size. You can pass (x.cuda(), torch.LongTensor(len).cuda()) to the model so that both are distributed across the GPUs, and then inside the model (which has already been distributed across the GPUs) use (x, list(len)) as the input. It's a straightforward but inefficient method… Am I clear?
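In case it helps, here is a minimal sketch of that approach. The Encoder class, its layer sizes, and the random inputs are hypothetical stand-ins for the model in the question, not the original code. It takes token indices rather than pre-embedded x, uses lengths.tolist() instead of list(len) so that pack_padded_sequence receives plain Python ints, and sets enforce_sorted=False on the assumption that the batch is not sorted by length. It should also run without a GPU, in which case DataParallel just calls the module directly.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence


class Encoder(nn.Module):
    """Hypothetical stand-in for the encode step of the autoencoder."""

    def __init__(self, vocab_size=100, d_embed=8, d_hidden=16):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, d_embed)
        self.encoder = nn.LSTM(d_embed, d_hidden, batch_first=True)

    def forward(self, indices, lengths):
        # lengths is a tensor here, so each replica only receives the
        # entries that belong to its own slice of the batch.
        embeddings = self.embedding(indices)
        packed = pack_padded_sequence(embeddings,
                                      lengths.tolist(),  # back to a list of ints
                                      batch_first=True,
                                      enforce_sorted=False)  # assumption: unsorted batch
        _, (hidden, cell) = self.encoder(packed)
        # Return a batch-first tensor so the outputs are gathered along dim 0.
        return hidden[-1]


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.DataParallel(Encoder(), dim=0).to(device)

batch_size, max_len = 64, 10
indices = torch.randint(0, 100, (batch_size, max_len), device=device)
# Pass the lengths as a LongTensor instead of a Python list.
lengths = torch.randint(1, max_len + 1, (batch_size,), device=device)

out = model(indices, lengths)
print(out.shape)
```

The conversion back to a list forces a device-to-host copy on every forward pass, which is the inefficiency mentioned above.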