I have a list of files that all contain data of equal length. To simplify, let's say there are two files containing:
file1: [1, 2, 3, 4, 5]
file2: [6, 7, 8, 9, 10]
Currently I create a dataset for each file and combine them with ConcatDataset() before passing the result to my dataloader. Let's assume a sequence length of 2.
Now if I use a batch size of 2, will the dataloader return [[1, 2], [3, 4]] or [[1, 2], [6, 7]]?
My desired batch is [[1, 2], [6, 7]], so that each row of the batch starts at the beginning of its own file. The next batch, [[3, 4], [8, 9]], would then continue each row's sequence, so the LSTM state carried over from the previous batch makes sense.
How can I achieve that?
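
To make the ordering I am after concrete, here is a minimal pure-Python sketch. It assumes non-overlapping windows, equal-length files, and that the concatenated dataset indexes all windows of file1 first, then all windows of file2. The helper name `interleaved_batches` and its parameters are hypothetical, made up for illustration only.

```python
def interleaved_batches(num_files, windows_per_file, batch_size):
    """Yield lists of flat indices so that each batch row stays on one file.

    Hypothetical helper: assumes every file has the same number of windows
    and that flat index = file_index * windows_per_file + step.
    """
    for group_start in range(0, num_files, batch_size):
        group_end = min(group_start + batch_size, num_files)
        for step in range(windows_per_file):
            # Same time step taken from every file in the current group.
            yield [f * windows_per_file + step for f in range(group_start, group_end)]


files = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
seq_len = 2
windows_per_file = len(files[0]) // seq_len  # the trailing element is dropped

# Flat list of windows, in the order assumed for the concatenated dataset.
windows = [
    f[i * seq_len:(i + 1) * seq_len]
    for f in files
    for i in range(windows_per_file)
]

index_batches = list(interleaved_batches(len(files), windows_per_file, batch_size=2))
batches = [[windows[i] for i in idxs] for idxs in index_batches]

print(index_batches)  # [[0, 2], [1, 3]]
print(batches)        # [[[1, 2], [6, 7]], [[3, 4], [8, 9]]]
```

DataLoader accepts a `batch_sampler` argument that yields lists of indices, so index lists like these might be usable there, but I have not verified that this is the right approach.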