Beginner here, so please bear with me. I’m adapting this LSTM tutorial to predict a time series instead of handwritten digits.

In the original problem (using MNIST) there are 60000 28 * 28 images that are used to train the network. These get reshaped into a 28 * 60000 * 28 tensor to be ingested by the model.

My original data is a one-dimensional time series with shape `(40000,)`. With a batch size of 20 I reshape it into a `(5, 8000, 1)` tensor corresponding to (timesteps, batches, features).
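For concreteness, the reshape I’m doing looks roughly like this (a NumPy sketch; `series` stands in for my real data):

```python
import numpy as np

# Stand-in for my real data: a 1-D series of 40000 points
series = np.random.randn(40000).astype(np.float32)

# Reshape into (timesteps, batches, features) = (5, 8000, 1)
x = series.reshape(5, -1, 1)
print(x.shape)  # (5, 8000, 1)
```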

I’m trying to build an LSTM that takes 5 timesteps and predicts the “next” one, using a hidden layer of dimension 128 (`5 --> 128 --> 1`), but I’m getting a size mismatch when I run the code. I can make the error go away, but I don’t quite get what is going on.
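In case it helps, here’s a minimal sketch of the model I have in mind (my own naming, following the tutorial’s structure; the real code may differ in details):

```python
import torch
import torch.nn as nn

class SeriesLSTM(nn.Module):
    def __init__(self, n_features=1, hidden_size=128):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_features, hidden_size=hidden_size)
        self.fc = nn.Linear(hidden_size, 1)  # map the hidden state to a scalar prediction

    def forward(self, x):
        # x: (timesteps, batch, features)
        out, _ = self.lstm(x)
        return self.fc(out[-1])  # use the last timestep to predict the "next" value

model = SeriesLSTM()
batch = torch.randn(5, 20, 1)  # 5 timesteps, batch of 20, 1 feature
pred = model(batch)
print(pred.shape)  # torch.Size([20, 1])
```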

I’m getting the following error:

`RuntimeError: size mismatch, m1: [20 x 1], m2: [5 x 512]`

- `20` is the batch size I defined
- `1` is the sequence length (the number of features; only the time series itself at this point)
- `5` is the number of timesteps
- `512` is 128 * 4, but I’m not sure where this comes from (why four times the dimension of the hidden layer?)
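One thing I did notice while poking around: PyTorch’s `nn.LSTM` stores the four gate weight matrices stacked together, so the input-to-hidden weight has `4 * hidden_size` rows, which matches the `512` in the error (a sketch, assuming `input_size=5` and `hidden_size=128`):

```python
import torch.nn as nn

lstm = nn.LSTM(input_size=5, hidden_size=128)

# weight_ih_l0 stacks the input, forget, cell, and output gate weights:
# shape is (4 * hidden_size, input_size) = (512, 5)
print(lstm.weight_ih_l0.shape)  # torch.Size([512, 5])
```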

So obviously if I change the sequence length to 5 it works, but I’m confused because then I would have an input tensor with shape `(5, 1600, 5)` and not the desired `(5, 8000, 1)`.

The new shape doesn’t seem right because I want to take 5 data points to predict the 6th, so the input should be a `5 x 1` vector that maps to a scalar, not a `5 x 5` grid that maps to a scalar (like the `28 x 28` grid in the original MNIST code).
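To make the contrast concrete, here’s what I mean by vector vs. grid (a NumPy sketch; `series` stands in for my data):

```python
import numpy as np

series = np.arange(40000, dtype=np.float32)  # stand-in for my data

# What I want: 5 consecutive scalar readings -> predict the 6th
window = series[:5].reshape(5, 1)   # a 5 x 1 "vector" of timesteps
target = series[5]                  # the scalar to predict

# What the workaround implies: each timestep carries 5 features
grid = series[:25].reshape(5, 5)    # a 5 x 5 "grid", like MNIST's 28 x 28
```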

What am I not getting?

Thanks for any insight.