[Solved] Minibatching with LSTMs


I am unable to understand why an LSTM's hidden parameters have the minibatch size as one of their dimensions. Shouldn't the model be independent of the minibatch size, which is an attribute of the training process?

I am especially facing this issue at test time: the model expects me to run the forward pass on a minibatch of a certain size, while I want to be able to make individual predictions.

Figured it out: I had confused the hidden state with the model parameters. The hidden state has a batch dimension because one state is kept per sequence in the batch, but the learned parameters themselves do not depend on the batch size.
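A minimal sketch of the distinction, assuming the question is about PyTorch's `torch.nn.LSTM` (the shapes and argument names below are PyTorch's, with `batch_first=False`): the learned parameters carry no batch dimension, while the hidden state does, so the same module can be run with a batch of 32 during training and a batch of 1 at inference.

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2)

# The learned parameters (weight_ih_l0, weight_hh_l0, ...) have shapes
# built only from input_size and hidden_size -- no batch dimension anywhere.
for name, p in lstm.named_parameters():
    print(name, tuple(p.shape))

# Training-style forward pass: input is (seq_len, batch, input_size),
# hidden/cell states are (num_layers, batch, hidden_size).
x_train = torch.randn(5, 32, 10)
h0 = torch.zeros(2, 32, 20)
c0 = torch.zeros(2, 32, 20)
out, (hn, cn) = lstm(x_train, (h0, c0))   # out: (5, 32, 20)

# Same module, single prediction: just use batch size 1. If no hidden
# state is passed, PyTorch initializes zeros with the matching batch size.
x_test = torch.randn(5, 1, 10)
out_single, (h1, c1) = lstm(x_test)       # out_single: (5, 1, 20)
```

So the batch dimension lives only in the activations you pass through the model, not in the model itself; at test time you simply supply inputs (and, if needed, an initial hidden state) with batch size 1.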