LSTM batch size parameter and dataloader batch size are not the same

se_ti · March 29, 2019, 9:48am

I am training an LSTM, and the batch in the last iteration is much smaller than the batch size I specified when creating the LSTM model. Is there a way to overcome this, or should the dataset size always be an exact multiple of the batch size?

vdw · March 29, 2019, 9:54am

Either guarantee that all batches have the same size, or adjust the hidden state to match each batch. I assume that your training code re-initializes the hidden state with zeros for every batch (a common practice). Example code often has a method called init_hidden(). You can simply extend this method to init_hidden(batch_size) so that it returns a zeroed hidden state whose batch dimension matches the current batch.
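
A minimal sketch of that idea, assuming PyTorch (the thread does not name the framework). The class name SequenceModel, the layer sizes, and the random data are hypothetical and only there to make the example run; the relevant part is that init_hidden(batch_size) is called with the size of the batch actually received.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


class SequenceModel(nn.Module):
    """Hypothetical single-layer LSTM model, for illustration only."""

    def __init__(self, input_size=4, hidden_size=8, num_layers=1):
        super().__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def init_hidden(self, batch_size):
        # h_0 and c_0 both have shape (num_layers, batch_size, hidden_size),
        # even when the LSTM is built with batch_first=True.
        weight = next(self.parameters())
        h0 = weight.new_zeros(self.num_layers, batch_size, self.hidden_size)
        c0 = weight.new_zeros(self.num_layers, batch_size, self.hidden_size)
        return h0, c0

    def forward(self, x, hidden):
        out, hidden = self.lstm(x, hidden)
        # Predict from the output at the last time step.
        return self.head(out[:, -1, :]), hidden


# 10 samples with batch_size=4 yields batches of 4, 4 and 2.
data = TensorDataset(torch.randn(10, 5, 4), torch.randn(10, 1))
loader = DataLoader(data, batch_size=4)

model = SequenceModel()
batch_sizes = []
for x, y in loader:
    # Size the hidden state from the batch itself, not from a fixed constant.
    hidden = model.init_hidden(x.size(0))
    pred, hidden = model(x, hidden)
    batch_sizes.append(pred.size(0))
```

For the other option, keeping every batch the same size, DataLoader accepts drop_last=True, which discards the incomplete final batch at the cost of skipping a few samples each epoch.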

se_ti · March 29, 2019, 10:01am

Yes, you are right! I do re-initialize the hidden state for every batch. Thank you very much for your reply!
It worked!