I am trying to train a model on several GPUs with DataParallel. The model uses several RNNs internally (with pack_padded_sequence and pad_packed_sequence), but I get this error:
ValueError: gather got an input of invalid size: got 10x13x29, but expected 10x14x29
Each RNN is constructed with batch_first=True flag.
Pseudocode of my model looks like this:

    class Model:
        def forward(self, x, lengths):
            for rnn in self.rnns:
                x = rnn(x, lengths)
            return x

    class RNN:
        def forward(self, x, lengths):
            # ... compute output_lengths from lengths
            x = pack_padded_sequence(x, output_lengths, batch_first=True)
            x, _ = self.rnn(x)
            x, _ = pad_packed_sequence(x, batch_first=True)
            return x
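For reference, here is a minimal self-contained sketch of the setup described above (the layer sizes, the GRU cell, and the `RNNLayer` name are my simplifications, not the real model). On a single device this runs fine; the sizes only diverge across DataParallel replicas:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

class RNNLayer(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.rnn = nn.GRU(input_size, hidden_size, batch_first=True)

    def forward(self, x, lengths):
        packed = pack_padded_sequence(x, lengths, batch_first=True)
        out, _ = self.rnn(packed)
        # pads only up to the longest sequence *in this batch* --
        # under DataParallel, each replica sees a different sub-batch,
        # so each replica may pad to a different time dimension
        out, _ = pad_packed_sequence(out, batch_first=True)
        return out

class Model(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers=2):
        super().__init__()
        sizes = [input_size] + [hidden_size] * num_layers
        self.rnns = nn.ModuleList(
            RNNLayer(sizes[i], sizes[i + 1]) for i in range(num_layers)
        )

    def forward(self, x, lengths):
        for rnn in self.rnns:
            x = rnn(x, lengths)
        return x

model = Model(input_size=8, hidden_size=16)
x = torch.randn(3, 5, 8)          # (batch, time, features)
out = model(x, [5, 4, 2])         # lengths sorted descending
print(out.shape)                  # (3, 5, 16) on a single device
```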
If I don’t use pack_padded_sequence, everything works great. What could be the problem? I would appreciate any hints/directions.
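One workaround I came across in the PyTorch docs and am considering: pad_packed_sequence accepts a total_length argument, which forces the output to be padded to a fixed length instead of the longest sequence in the (per-replica) sub-batch. A sketch of how the layer's forward might use it (again, the class and argument names here are my own):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

class RNNLayer(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.rnn = nn.GRU(input_size, hidden_size, batch_first=True)

    def forward(self, x, lengths):
        # remember the full (pre-split) time dimension
        total_length = x.size(1)
        packed = pack_padded_sequence(x, lengths, batch_first=True)
        out, _ = self.rnn(packed)
        # total_length pads every replica back to the same time dimension,
        # so gather should see matching sizes
        out, _ = pad_packed_sequence(
            out, batch_first=True, total_length=total_length
        )
        return out

layer = RNNLayer(4, 6)
x = torch.randn(2, 7, 4)   # time dimension 7
out = layer(x, [5, 3])     # longest actual sequence is only 5
print(out.shape)           # (2, 7, 6): padded to total_length, not to 5
```

Would this be the right direction, or is there a better way to combine packed sequences with DataParallel?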