When I wrap nn.DataParallel around my models that contain LSTMs, I keep getting runtime errors when packing my padded sequences, because the list of sequence lengths that I pass in is not automatically split along the batch dimension.
autoencoder = nn.DataParallel(autoencoder, dim=0)

# Method of autoencoder called in forward
def encode(self, indices, lengths):
    embeddings = self.embedding(indices)
    # error happens during packing
    packed_embeddings = pack_padded_sequence(input=embeddings,
                                             lengths=lengths,
                                             batch_first=True)
    packed_output, state = self.encoder(packed_embeddings)
    hidden, cell = state
For example, with batch size 64, the word index tensors I pass in are split along the batch dimension (32 each), but the sequence lengths list that I pass to pack_padded_sequence still has length 64. I am using the batch dimension as the first dimension (dimension 0).
I’ve seen several posts related to this issue, but haven’t found a definitive solution for this. Do I have to make the sequence lengths list a tensor? Or do I have to make the batch dimension the second dimension (dimension 1)? Any suggestions?
Maybe you can wrap the lengths list in a LongTensor variable and use the .cuda() method so it gets split across multiple GPUs; then, in the forward() method, cast the LongTensor back to a list. This works for my problem, though it is a bit inefficient.
Assume (x, len) is the input to your packed RNN, where x is a (batch_size, max_len, d_embed) tensor and len is a list of length batch_size. You can pass (x.cuda(), torch.LongTensor(len).cuda()) to distribute them across the GPUs, and then inside the model (which has already been distributed across the GPUs), use (x, len.tolist()) as the input. It's a straightforward but inefficient method… Is that clear?
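To make the idea above concrete, here is a minimal, self-contained sketch of that workaround. The `Encoder` module and all sizes are made up for illustration; the point is only that `lengths` travels as a tensor (so DataParallel can scatter it along dim 0 together with the input) and is converted back to a plain list inside forward(). The DataParallel wrapping line is commented out so the snippet also runs on CPU.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence

# Hypothetical module for illustration: receives lengths as a tensor chunk
# on each replica and converts it to a list before packing.
class Encoder(nn.Module):
    def __init__(self, d_embed=8, d_hidden=16):
        super().__init__()
        self.lstm = nn.LSTM(d_embed, d_hidden, batch_first=True)

    def forward(self, x, lengths):
        # lengths arrives as a LongTensor; pack_padded_sequence
        # expects per-sequence lengths, so cast back to Python ints
        lengths = lengths.tolist()
        packed = pack_padded_sequence(x, lengths, batch_first=True)
        output, (hidden, cell) = self.lstm(packed)
        return hidden

model = Encoder()
# model = nn.DataParallel(model, dim=0)  # wrap like this when GPUs are available

x = torch.randn(4, 5, 8)               # (batch_size, max_len, d_embed)
lengths = torch.tensor([5, 4, 2, 1])   # sorted descending, as packing requires
hidden = model(x, lengths)             # shape: (num_layers, batch, d_hidden)
```

Because `lengths` is a tensor, DataParallel splits it the same way it splits `x`, so each replica sees a matching (inputs, lengths) pair; the extra tensor round-trip is the inefficiency mentioned above.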