Question on training LSTM to play Snake


I followed the DQN tutorial and trained a CNN to play a game. Now I want to switch to an LSTM, but I ran into a problem.

I found the tutorial for LSTM, and it recommends an input shape of (sequence length, batch size, features). In my code, the input to the LSTM is (4,32,100), where 4 is the number of consecutive frames, 32 is the batch size, and 100 is the length of the vector representing the current state.

Then I add an nn.Linear layer after the LSTM, and the input size of the linear layer is 432lstm_output_size. Here comes the problem: in training, the batch size is 32 -> (4,32,100), but in testing, the batch size is 1 -> (4,1,100), which causes an error.

I tried training the LSTM with batch size 1, but it takes significantly longer. Is there any way to train the LSTM with batch=32 and do inference with batch=1?

The number seems to be formatted a bit wrong, but I take it that the input features of the linear layer are defined as seq_len * batch_size * lstm_output_size?
You shouldn’t use the batch size as part of the number of input features, as this limits your model to that particular batch size and assumes that the samples in the batch are somehow related to each other.

Instead, permute the LSTM output so that the batch size is in dim0, flatten the remaining dimensions, and set the in_features of the linear layer to e.g. seq_len * lstm_output_size.
This will allow you to use arbitrary batch sizes.
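
A minimal sketch of the permute approach, to make the shapes concrete. The hidden size (64), the number of actions (4), and the class name LSTMQNet are assumptions for illustration and are not taken from the original code:

```python
import torch
import torch.nn as nn

# SEQ_LEN and FEATURES come from the question; HIDDEN and N_ACTIONS are assumed values.
SEQ_LEN, FEATURES, HIDDEN, N_ACTIONS = 4, 100, 64, 4

class LSTMQNet(nn.Module):  # hypothetical name
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=FEATURES, hidden_size=HIDDEN)
        # in_features depends only on seq_len and the hidden size, not on the batch size
        self.fc = nn.Linear(SEQ_LEN * HIDDEN, N_ACTIONS)

    def forward(self, x):
        # x: (seq_len, batch, features)
        out, _ = self.lstm(x)               # (seq_len, batch, hidden)
        out = out.permute(1, 0, 2)          # (batch, seq_len, hidden)
        out = out.reshape(out.size(0), -1)  # (batch, seq_len * hidden)
        return self.fc(out)                 # (batch, n_actions)

model = LSTMQNet()
train_q = model(torch.randn(SEQ_LEN, 32, FEATURES))  # training-sized batch
test_q = model(torch.randn(SEQ_LEN, 1, FEATURES))    # single sample at inference
print(train_q.shape, test_q.shape)
```

The same module handles both calls because the batch dimension is never baked into the layer definition.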

Alternatively, you could also use batch_first=True when you are creating an instance of LSTM, which will accept and return the batch size in dim0.
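
A rough sketch of the batch_first=True variant, again assuming a hidden size of 64 and 4 actions (neither value is from the original post). With this flag the input is expected as (batch, seq_len, features), so data stored as (4,32,100) would need x.transpose(0, 1) first:

```python
import torch
import torch.nn as nn

# hidden_size=64 and 4 output actions are assumed values for illustration
lstm = nn.LSTM(input_size=100, hidden_size=64, batch_first=True)
fc = nn.Linear(4 * 64, 4)  # in_features = seq_len * hidden_size

def q_values(x):
    # x: (batch, seq_len, features)
    out, _ = lstm(x)                      # (batch, seq_len, hidden)
    return fc(out.flatten(start_dim=1))   # (batch, 4)

batch_out = q_values(torch.randn(32, 4, 100))   # training-sized batch
single_out = q_values(torch.randn(1, 4, 100))   # single sample at inference
print(batch_out.shape, single_out.shape)
```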