PackedSequence cannot be used with DataParallel

yihan0512 · August 9, 2018, 11:20pm

I need to use the pad_sequence() and pack_padded_sequence() functions because my inputs have variable length. However, when I combine this with nn.DataParallel and the input data is distributed to multiple GPUs, the PackedSequence object is broken up into a tuple whose first element is PackedSequence.data and whose second is PackedSequence.batch_sizes. DataParallel splits only PackedSequence.data evenly across the GPUs but replicates PackedSequence.batch_sizes on each GPU. This makes it impossible to decode the packed sequence with batch_sizes, since each GPU holds only a slice of the data but the batch sizes for the whole batch. Does this mean the PackedSequence object is currently not compatible with nn.DataParallel, or is there a workaround? Thank you.
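To make the layout concrete, here is a minimal sketch of what a PackedSequence holds. The three sequences and their values are made up for illustration. Note that data is stored time step by time step, so cutting it into equal chunks along dimension 0 cuts across time steps rather than across sequences, and batch_sizes no longer describes either chunk.

```python
import torch
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

# Three made-up sequences of lengths 3, 2 and 1, already sorted longest first.
seqs = [torch.tensor([1., 2., 3.]), torch.tensor([4., 5.]), torch.tensor([6.])]
lengths = torch.tensor([3, 2, 1])

padded = pad_sequence(seqs, batch_first=True)  # shape (3, 3), zero-padded
packed = pack_padded_sequence(padded, lengths, batch_first=True)

# data is time-major: every sequence's first step, then the second steps, etc.
print(packed.data)         # values 1, 4, 6, 2, 5, 3
# batch_sizes[t] is the number of sequences still active at time step t.
print(packed.batch_sizes)  # values 3, 2, 1
```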

Novak · August 10, 2018, 5:19am

There is an excellent workaround described here.
I implemented something very similar after reading it, and it worked just fine.
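For readers without the link, here is a minimal sketch of one commonly used workaround; it is not necessarily identical to the one linked above. The idea is to hand nn.DataParallel only plain tensors (the padded batch and the lengths) and do the packing and unpacking inside forward(), so each replica packs its own slice of the batch. The class name PackedLSTM and all sizes are hypothetical, and the enforce_sorted argument assumes a PyTorch version newer than the one available when this thread was written.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

class PackedLSTM(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)

    def forward(self, padded, lengths):
        # Remember the padded length so every replica returns the same
        # time dimension and the outputs can be gathered along the batch dim.
        total_length = padded.size(1)
        # Pack inside forward(): this replica only sees its own batch slice.
        packed = pack_padded_sequence(
            padded, lengths.cpu(), batch_first=True, enforce_sorted=False
        )
        out, _ = self.lstm(packed)
        out, _ = pad_packed_sequence(
            out, batch_first=True, total_length=total_length
        )
        return out

model = PackedLSTM(input_size=8, hidden_size=16)
if torch.cuda.device_count() > 1:
    # Only wrap when several GPUs are present; otherwise run as-is on CPU.
    model = nn.DataParallel(model).cuda()

padded = torch.randn(4, 5, 8)          # (batch, max_len, features), made-up data
lengths = torch.tensor([5, 3, 2, 1])   # true length of each sequence
out = model(padded, lengths)           # shape (4, 5, 16)
```

Passing total_length matters because a replica whose slice contains only short sequences would otherwise return a shorter time dimension than the others, and the gather step would fail.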

yihan0512 · August 10, 2018, 5:40am

Great, I’ll take a look, thanks!