I need to use pad_sequence() and pack_padded_sequence() because my inputs have variable length. However, when I combine them with nn.DataParallel and the input is distributed across multiple GPUs, the PackedSequence object is unpacked into a tuple whose first element is PackedSequence.data and whose second is PackedSequence.batch_sizes. DataParallel splits only PackedSequence.data evenly across the GPUs and replicates PackedSequence.batch_sizes on each GPU, which makes it impossible to decode the packed sequence using batch_sizes. Does this mean PackedSequence is currently incompatible with nn.DataParallel, or is there a workaround? Thank you.
yihan0512 (Yi Han) #1
Novak (Novak) #2
There is an excellent workaround described here.
I implemented something very similar after reading it, and it worked just fine.
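For anyone reading without the link: the sketch below shows one common way around the problem, and it is not necessarily the exact approach from the linked post. The idea is to pass the padded tensor and the lengths to the wrapped module, so that DataParallel splits both along the batch dimension, and to pack and unpack only inside forward(). The class name PackedLSTM and the sizes are made up for illustration. The total_length argument keeps every replica's output the same width so the results can be gathered.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import (
    pad_sequence,
    pack_padded_sequence,
    pad_packed_sequence,
)


class PackedLSTM(nn.Module):  # hypothetical module name, for illustration only
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)

    def forward(self, padded, lengths):
        # padded: (batch, max_len, input_size), lengths: (batch,)
        # Each replica receives its own slice of both tensors.
        total_length = padded.size(1)
        packed = pack_padded_sequence(
            padded,
            lengths.cpu(),          # lengths must live on the CPU for packing
            batch_first=True,
            enforce_sorted=False,   # slices are not sorted by length
        )
        packed_out, _ = self.lstm(packed)
        # Pad back to the full width so outputs from all replicas line up.
        out, _ = pad_packed_sequence(
            packed_out, batch_first=True, total_length=total_length
        )
        return out


# Four variable-length sequences with 8 features each (made-up sizes).
seqs = [torch.randn(n, 8) for n in (5, 3, 2, 4)]
lengths = torch.tensor([s.size(0) for s in seqs])
padded = pad_sequence(seqs, batch_first=True)  # shape (4, 5, 8)

model = nn.DataParallel(PackedLSTM(8, 16))
if torch.cuda.is_available():
    model = model.cuda()
    padded, lengths = padded.cuda(), lengths.cuda()

out = model(padded, lengths)  # shape (4, 5, 16)
```

Without total_length, each replica would pad only to the longest sequence in its own slice, and gathering the outputs could fail with a size mismatch.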
yihan0512 (Yi Han) #3
Great, I’ll take a look, thanks!