DataLoader creating small batches with leftovers that cause errors

I have a validation set that is 1302 examples long. I create a DataLoader object from the data set and pass my desired batch size, which is 50.

During training, I iterate over the DataLoader, which provides a batch of size 50 and the associated labels. However, the last batch causes an error to be thrown at the nn.LSTM layer, because it expects a hidden[0] of size (a, 50, b) but instead receives size (a, 2, b).

It appears the DataLoader yields a final batch containing the 2 leftover examples (1302 = 26 × 50 + 2).
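A quick sanity check of where the stray batch comes from (plain Python, no torch needed; the numbers are just the sizes from this question):

```python
# Batch sizes a DataLoader will produce for 1302 examples with batch_size=50
n, batch_size = 1302, 50
batches = [min(batch_size, n - i) for i in range(0, n, batch_size)]
# -> 26 full batches of 50, then one leftover batch of 2
print(len(batches), batches[-1])
```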

Are there any solutions, possibly specific to the LSTM class, that do not involve either throwing out those last two examples or finding a different batch size the data can be equally divided into?

My training set size is a prime number, so there is no batch size that divides both sets evenly, and I want to avoid throwing out data as much as possible.


You could initialize the LSTM states from the batch size of the incoming input instead of the global batch size.
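A minimal sketch of that approach, assuming a batch_first LSTM wrapped in a module (the class name and the sizes here are illustrative, not from the question):

```python
import torch
import torch.nn as nn

class LSTMModel(nn.Module):
    def __init__(self, input_size=10, hidden_size=16, num_layers=1):
        super().__init__()
        self.num_layers = num_layers
        self.hidden_size = hidden_size
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)

    def forward(self, x):
        # Derive the batch size from the incoming tensor instead of a
        # stored constant, so the leftover batch of 2 works as well.
        batch = x.size(0)
        h0 = torch.zeros(self.num_layers, batch, self.hidden_size)
        c0 = torch.zeros(self.num_layers, batch, self.hidden_size)
        out, _ = self.lstm(x, (h0, c0))
        return out

model = LSTMModel()
full = model(torch.randn(50, 5, 10))  # a full batch of 50
last = model(torch.randn(2, 5, 10))   # the leftover batch of 2
```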
If that’s not possible for some reason, you would have to remove these samples using drop_last=True in the DataLoader.
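For completeness, the drop_last route looks like this (the dataset shapes below are made up to match the sizes in the question):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

ds = TensorDataset(torch.randn(1302, 10), torch.randint(0, 2, (1302,)))
loader = DataLoader(ds, batch_size=50, drop_last=True)

sizes = [xb.size(0) for xb, yb in loader]
# 26 batches of 50; the 2 leftover samples are silently skipped
print(len(sizes))
```

Note that with shuffling enabled during training, a different pair of samples is dropped each epoch, so no example is permanently excluded.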
