I am working on a machine translation dataset and the input entries are sorted from the shortest sequence to the longest sequence. I pad them to the max length in each batch using collate_fn.
Is there a way to build the batches without shuffling, so that entries of similar length end up in the same batch (since the data is sorted by length), while still shuffling the order in which the batches themselves are retrieved?
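
To make the desired behaviour concrete, here is a minimal sketch of what I have in mind, not a tested solution. The class name ShuffledBatchSampler and its parameters are hypothetical. It assumes the dataset is already sorted by length, and that the object can be passed to the DataLoader through its batch_sampler argument, which accepts an iterable that yields lists of indices.

```python
import random


class ShuffledBatchSampler:
    """Yield contiguous index batches, visiting the batches in random order.

    Hypothetical helper: assumes the dataset is already sorted by length,
    so contiguous indices have similar lengths.
    """

    def __init__(self, dataset_len, batch_size, seed=None):
        # Contiguous chunks keep neighbours in the sorted data together.
        # The last batch may be smaller than batch_size.
        self.batches = [
            list(range(start, min(start + batch_size, dataset_len)))
            for start in range(0, dataset_len, batch_size)
        ]
        self.rng = random.Random(seed)

    def __iter__(self):
        # Shuffle only the order of the batches, never their contents.
        # A new order is drawn every time iteration starts (each epoch).
        order = list(range(len(self.batches)))
        self.rng.shuffle(order)
        for i in order:
            yield self.batches[i]

    def __len__(self):
        return len(self.batches)


# Hypothetical usage (dataset and collate_fn are placeholders):
# from torch.utils.data import DataLoader
# sampler = ShuffledBatchSampler(len(dataset), batch_size=32)
# loader = DataLoader(dataset, batch_sampler=sampler, collate_fn=collate_fn)
```

When batch_sampler is given, batch_size, shuffle, sampler and drop_last have to be left at their defaults on the DataLoader, since the sampler already decides what goes into each batch.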