Alex, thanks for the reply. I read the post you linked, and I am confused about the tensor shape.
“So if you divide a time series of length 10000 into chunks of length 50, your input tensor would be 50 (timesteps) by 200 (batch size) by 1 (features).”
"There are 200 batches in the dataset; each batch is 50x200x1."
If I understand correctly, does that mean we train on only one batch? Since 50*200 = 10000, and 10000 is the total number of data points, a single 50x200x1 batch would already contain the whole series.
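
To make my arithmetic concrete, here is a small plain-Python sketch of how I am reading the first quote. It is my own illustration, not code from the post: the toy `series` and the names `chunk_len`, `chunks`, and `tensor` are made up, and I am assuming the series is simply cut into non-overlapping chunks.

```python
# Hypothetical toy series standing in for the real data (10000 values).
series = list(range(10000))
chunk_len = 50  # timesteps per chunk, as in the quoted post

# Cut the series into non-overlapping chunks of 50 timesteps each.
chunks = [series[i:i + chunk_len] for i in range(0, len(series), chunk_len)]
num_chunks = len(chunks)  # 10000 / 50 = 200

# Arrange as (timesteps, batch, features): tensor[t][b] is a 1-element feature list.
tensor = [[[chunks[b][t]] for b in range(num_chunks)] for t in range(chunk_len)]

shape = (len(tensor), len(tensor[0]), len(tensor[0][0]))
total_values = shape[0] * shape[1] * shape[2]

print(shape)         # (50, 200, 1)
print(total_values)  # 10000, i.e. every value in the series
```

Under this reading, one 50x200x1 tensor uses up all 10000 values, which is why "200 batches, each 50x200x1" does not add up for me.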