[SOLVED] DataParallel error: gather got an input of invalid size:

Hi,

I'm trying to train a model on several GPUs with DataParallel. The model uses several RNNs internally (with pack_padded_sequence and pad_packed_sequence), but I get this error:

ValueError: gather got an input of invalid size: got 10x13x29, but expected 10x14x29

Each RNN is constructed with the batch_first=True flag.
Pseudocode of my model looks like this:

import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

class Model(nn.Module):

    def forward(self, x, lengths):
        # run the input through a stack of RNN blocks
        for rnn in self.rnns:
            x = rnn(x, lengths)
        return x

class RNN(nn.Module):

    def forward(self, x, lengths):
        # ... (compute output_lengths)
        x = pack_padded_sequence(x, output_lengths, batch_first=True)
        x, _ = self.rnn(x)
        x, _ = pad_packed_sequence(x, batch_first=True)
        return x

If I don’t use pack_padded_sequence, everything works fine. What could be the problem? I would appreciate any hints/directions.


Refer to the PyTorch FAQ section on using total lengths. There's a subtlety in using the pack_padded_sequence and pad_packed_sequence utility functions together with DataParallel.
https://pytorch.org/docs/stable/notes/faq.html
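Roughly, DataParallel scatters the batch across GPUs, so each replica's pad_packed_sequence output is only padded to that replica's longest sequence, and the time dimensions no longer match when the outputs are gathered (hence 13 vs. 14 in your error). The FAQ's suggestion is to pass total_length to pad_packed_sequence so every replica pads back to the full padded length. A minimal sketch of that pattern, assuming an LSTM block; module and variable names here are illustrative, not from your code:

    import torch.nn as nn
    from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

    class RNNBlock(nn.Module):
        def __init__(self, input_size, hidden_size):
            super().__init__()
            self.rnn = nn.LSTM(input_size, hidden_size, batch_first=True)

        def forward(self, x, lengths):
            # x.size(1) is the padded length of the original batch; scatter only
            # splits the batch dimension, so every replica sees the same value.
            total_length = x.size(1)
            packed = pack_padded_sequence(x, lengths, batch_first=True)
            packed_out, _ = self.rnn(packed)
            # total_length pads every replica's output back to the same number
            # of time steps, so the gather sizes match across GPUs.
            out, _ = pad_packed_sequence(packed_out, batch_first=True,
                                         total_length=total_length)
            return out

The wrapped model is then used as usual, e.g. nn.DataParallel(model)(x, lengths).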


Thank you very much!