Hey, I have more than one language modality in my model. To use `pack_padded_sequence` you need to sort the batch by sequence length (e.g. the length of the question), but the two modalities obviously have different length orderings. Right now I pass each of them through its own LSTM and concatenate the outputs afterwards, which means I need to return each output to the original batch order before concatenating.
Does it make sense to sort by length and then restore the original order after the LSTM? I'm worried I'll mess something up with the gradients.
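
To make it concrete, this is roughly what I have in mind for a single modality. It is only a minimal sketch, assuming PyTorch and `batch_first=True` inputs; the helper name `run_lstm_sorted` and the toy shapes are made up for illustration and are not from my real model.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence


def run_lstm_sorted(lstm, padded, lengths):
    """Sort a padded batch by length, run the LSTM, then restore the original order.

    Hypothetical helper: padded is (batch, max_len, features), lengths is (batch,).
    """
    # Longest sequence first, which is what pack_padded_sequence expects by default.
    sorted_lengths, sort_idx = torch.sort(lengths, descending=True)
    # argsort of a permutation is its inverse, so this undoes the sort.
    unsort_idx = torch.argsort(sort_idx)

    packed = pack_padded_sequence(
        padded[sort_idx], sorted_lengths.cpu(), batch_first=True
    )
    packed_out, (h_n, _) = lstm(packed)
    out, _ = pad_packed_sequence(
        packed_out, batch_first=True, total_length=padded.size(1)
    )

    # Reorder per-step outputs and the last layer's final hidden state
    # back to the original batch order using plain tensor indexing.
    return out[unsort_idx], h_n[-1][unsort_idx]


# Toy batch for one modality: 3 sequences, padded to length 5, 4 features each.
torch.manual_seed(0)
lstm = nn.LSTM(input_size=4, hidden_size=3, batch_first=True)
padded = torch.randn(3, 5, 4, requires_grad=True)
lengths = torch.tensor([2, 5, 3])

out, last = run_lstm_sorted(lstm, padded, lengths)
```

The idea would be to call this once per modality and then concatenate the two `last` tensors, which should line up row by row because both are back in the original batch order.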