I am training an LSTM, and the batch of data in the last iteration is much smaller than the batch size I specified when creating the LSTM model. Is there a way to handle this, or must the dataset size always be an exact multiple of the batch size?
Either you guarantee that all batches really do have the same size (for example, by dropping the incomplete last batch), or you size the hidden state to match each batch. I assume that your training code re-initializes the hidden state with zeros for every batch, which is common practice. Example code often has a method called init_hidden() for this. You can simply extend it to init_hidden(batch_size) and return the zeroed hidden state with the matching dimensions.
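To make this concrete, here is a minimal sketch, assuming you are using PyTorch (the question does not say which framework). The class name LSTMModel, the layer sizes, and the random toy data are all made up for illustration; only the init_hidden(batch_size) idea comes from the answer above.

```python
import torch
import torch.nn as nn


class LSTMModel(nn.Module):
    """Hypothetical model; the sizes are arbitrary illustration values."""

    def __init__(self, input_size=8, hidden_size=16, num_layers=1):
        super().__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)

    def init_hidden(self, batch_size):
        # Return (h_0, c_0), each of shape (num_layers, batch_size, hidden_size).
        # new_zeros creates the tensors on the same device/dtype as the weights.
        weight = next(self.parameters())
        shape = (self.num_layers, batch_size, self.hidden_size)
        return weight.new_zeros(shape), weight.new_zeros(shape)

    def forward(self, x, hidden):
        return self.lstm(x, hidden)


model = LSTMModel()
data = torch.randn(10, 5, 8)  # 10 sequences, 5 time steps, 8 features
batch_size = 4                # 10 is not a multiple of 4, so the last batch has 2

output_shapes = []
for start in range(0, len(data), batch_size):
    batch = data[start:start + batch_size]
    # Size the hidden state from the actual batch, not the nominal batch size.
    hidden = model.init_hidden(batch.size(0))
    out, hidden = model(batch, hidden)
    output_shapes.append(tuple(out.shape))
```

If you would rather take the first option and keep every batch the same size, PyTorch's DataLoader accepts drop_last=True, which discards the incomplete final batch.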
Yes, you are right! I do re-initialize the hidden state for every batch. Thank you very much for your reply!