LSTM Input Shape Query

I am new to LSTMs, and PyTorch’s implementation, torch.nn.LSTM(), has confused me further.
I am implementing an LSTM model for predicting the speeds of different frames (

The paper does not say much about the LSTM part of the model. All it says is that I have to consider the “speeds of 10 previous timestamps”. Also, the architecture diagram labels the block “LSTM 128”.

What should my input_size, hidden_size and num_layers be? What is “128” here?

Since I essentially have only one feature (speed), I am guessing input_size should be 1. Is this correct?

I can set seq_len to 10 (for the 10 previous timestamps).
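
To make the shape concrete, here is a minimal sketch of the input tensor I have in mind. It assumes batch_first=True ordering, and the batch size of 32 is an arbitrary placeholder of mine, not something from the paper:

```python
import torch

batch_size = 32  # arbitrary placeholder, not specified in the paper
seq_len = 10     # the 10 previous timestamps
input_size = 1   # one feature per timestamp: speed

# Random values stand in for real speed measurements.
# Layout is (batch, seq_len, input_size), which matches batch_first=True.
speeds = torch.randn(batch_size, seq_len, input_size)
print(speeds.shape)  # torch.Size([32, 10, 1])
```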

Is this correct or am I missing something here?

If that is correct, what should hidden_size, batch_size and num_layers be?
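
Here is a rough sketch of what I would try, under my own assumptions: that “128” means hidden_size=128, that there is a single layer (num_layers=1), and that the input is batch-first. As far as I can tell, batch_size is not a constructor argument of torch.nn.LSTM at all; it only shows up as a dimension of the input tensor. The value 32 below is arbitrary.

```python
import torch
import torch.nn as nn

# Assumptions (not confirmed by the paper): "LSTM 128" is read as
# hidden_size=128, and a single layer is used.
lstm = nn.LSTM(input_size=1, hidden_size=128, num_layers=1, batch_first=True)

# Dummy batch: 32 sequences (arbitrary), 10 timestamps each, 1 feature (speed).
x = torch.randn(32, 10, 1)

# output holds the hidden state at every timestep;
# h_n and c_n hold the final hidden and cell states per layer.
output, (h_n, c_n) = lstm(x)

print(output.shape)  # torch.Size([32, 10, 128])
print(h_n.shape)     # torch.Size([1, 32, 128])
print(c_n.shape)     # torch.Size([1, 32, 128])
```

Note that h_n and c_n are shaped (num_layers, batch, hidden_size) even with batch_first=True; that flag only affects the input and output tensors.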