I’m very new to PyTorch and my problem involves LSTMs with inputs of variable sizes.
Because each training example has a different length, what I'm trying to do is write a custom collate_fn to use with DataLoader to create mini-batches of my data. To my understanding, I'd need to implement my own collate_fn and pad the sequences there, presumably using pad_sequence and pack_padded_sequence somehow…
I have tried looking at examples online, but none of them seem to handle this case the way I need.
The problem arises from the fact that collate_fn gets passed the input batch as a list of (training_tensor, label) tuples, and I can't figure out how to properly convert that list into padded tensors that I can later pass to pack_padded_sequence.
What is the correct way of doing this?
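To make the question concrete, here is a minimal sketch of the kind of collate_fn I have in mind (the function name pad_collate is mine, and I'm assuming each training_tensor has shape (seq_len, features)):

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def pad_collate(batch):
    # batch is a list of (training_tensor, label) tuples from the Dataset
    seqs, labels = zip(*batch)
    # record the original length of each sequence before padding
    lengths = torch.tensor([s.size(0) for s in seqs])
    # pad every sequence to the longest one: (batch, max_len, features)
    padded = pad_sequence(seqs, batch_first=True)
    labels = torch.tensor(labels)
    return padded, lengths, labels

# usage:
#   loader = DataLoader(dataset, batch_size=4, collate_fn=pad_collate)
```

Is this roughly the right shape for the collate_fn, or am I missing something?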
A more general question: is there something better than DataLoader that works nicely with LSTMs?
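For context, this is my current understanding of how a padded batch would eventually reach the LSTM (a sketch with made-up sizes, where pack_padded_sequence compresses the padded batch and pad_packed_sequence unpacks the output afterwards):

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

lstm = nn.LSTM(input_size=3, hidden_size=8, batch_first=True)

# a zero-padded batch of 2 sequences with true lengths 5 and 2
padded = torch.randn(2, 5, 3)   # (batch, max_len, features)
lengths = torch.tensor([5, 2])

# pack so the LSTM skips the padding timesteps
packed = pack_padded_sequence(padded, lengths, batch_first=True,
                              enforce_sorted=False)
packed_out, (h_n, c_n) = lstm(packed)

# unpack back to a padded tensor: (batch, max_len, hidden)
out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)
# h_n[-1] holds the final hidden state of each sequence
```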