PackedSequence cannot be used with DataParallel

(Yi Han) #1

I need to use the pad_sequence() and pack_padded_sequence() functions because my input has variable length. However, when I combine this with nn.DataParallel and the input data is distributed to multiple GPUs, the PackedSequence object is deformed into a tuple, with the first element being the data tensor and the second being PackedSequence.batch_sizes. The parallel module splits only the data tensor evenly across the GPUs but replicates PackedSequence.batch_sizes for each GPU. This makes it impossible to decode the packed sequence by batch_sizes. I’m wondering whether this means the PackedSequence object is currently not compatible with nn.DataParallel, or whether there is a workaround? Thank you.
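To make the problem concrete, here is a small sketch of what a PackedSequence holds. The toy sequences and their lengths are made up for illustration. The point is that batch_sizes describes the whole batch at each time step, so it no longer matches a data tensor that has been cut into per-GPU chunks.

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_sequence

# Three toy sequences of lengths 3, 2, 1 with one feature each (illustrative values).
seqs = [torch.ones(3, 1), torch.ones(2, 1), torch.ones(1, 1)]
padded = pad_sequence(seqs, batch_first=True)  # shape (3, 3, 1)

packed = pack_padded_sequence(padded, torch.tensor([3, 2, 1]), batch_first=True)

# data is a flat (total_timesteps, features) tensor, ordered time step by time step.
print(packed.data.shape)   # torch.Size([6, 1])
# batch_sizes[t] is the number of sequences still active at time step t,
# counted over the WHOLE batch, not over one GPU's share of it.
print(packed.batch_sizes)  # tensor([3, 2, 1])
```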

(Novak) #2

There is an excellent workaround described here.
I implemented something very similar after reading it, and it worked just fine.
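For anyone who cannot follow the link, here is a minimal sketch of one common workaround. It is not necessarily the exact approach in the linked post. The idea is to pass the padded tensor and the lengths to nn.DataParallel and do the packing and unpacking inside forward(), so each replica packs only its own slice of the batch. The module name PackedLSTM, the layer sizes, and the toy data are all hypothetical.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import (
    pack_padded_sequence,
    pad_packed_sequence,
    pad_sequence,
)


class PackedLSTM(nn.Module):  # hypothetical module name, for illustration only
    def __init__(self, input_size=8, hidden_size=16):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)

    def forward(self, padded, lengths):
        # padded: (batch, max_len, features); lengths: (batch,)
        # DataParallel splits both along dim 0, so they stay aligned per replica.
        total_length = padded.size(1)
        packed = pack_padded_sequence(
            padded, lengths.cpu(), batch_first=True, enforce_sorted=False
        )
        out, _ = self.lstm(packed)
        # Pad back to the full max_len so that the outputs of all replicas
        # have the same shape and can be gathered along dim 0.
        out, _ = pad_packed_sequence(
            out, batch_first=True, total_length=total_length
        )
        return out


# Toy variable-length batch (illustrative values).
seqs = [torch.randn(n, 8) for n in (5, 3, 2, 4)]
lengths = torch.tensor([s.size(0) for s in seqs])
padded = pad_sequence(seqs, batch_first=True)  # (4, 5, 8)

model = PackedLSTM()
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model).cuda()
    padded, lengths = padded.cuda(), lengths.cuda()

out = model(padded, lengths)  # (4, 5, 16)
```

The total_length argument matters here: without it, each replica would pad only to the longest sequence in its own chunk, and the gathered outputs could end up with mismatched shapes.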

(Yi Han) #3

Great, I’ll take a look, thanks!