I’m training a network to perform time-series forecasting. My current implementation works, but it loops through the sequences one at a time, yielding one parameter update per input sequence per epoch. I want to mini-batch train so there is a parameter update after every 32 sequences (to hopefully improve training time and reduce gradient noise), but I can’t figure out how to set up the DataLoader / training loop to do this. I’d like to use a DataLoader so I can shuffle the training data.
NOTE: I am not talking about mini-batching the sequences themselves. Each LSTM input is still a tensor of size [100, 1, 8] representing [sequence length, sequence mini-batch size, number of features]. I want to train on 32 of these LSTM inputs per parameter update.
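Here’s a minimal sketch of what I’m aiming for, with dummy data standing in for my real dataset (the `SequenceDataset` class, the model sizes, and the regression head are placeholders, not my actual code):

```python
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

# Placeholder dataset: N sequences, each of shape [seq_len=100, features=8]
class SequenceDataset(Dataset):
    def __init__(self, sequences, targets):
        self.sequences = sequences  # tensor [N, 100, 8]
        self.targets = targets      # tensor [N, 1]

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, idx):
        return self.sequences[idx], self.targets[idx]

# Dummy data in place of my real sequences
N = 128
seqs = torch.randn(N, 100, 8)
tgts = torch.randn(N, 1)

loader = DataLoader(SequenceDataset(seqs, tgts), batch_size=32, shuffle=True)

model = nn.LSTM(input_size=8, hidden_size=16)  # batch_first=False (default)
head = nn.Linear(16, 1)                        # placeholder regression head
opt = torch.optim.Adam(list(model.parameters()) + list(head.parameters()))
loss_fn = nn.MSELoss()

for x, y in loader:
    # DataLoader stacks samples on dim 0 -> x is [32, 100, 8];
    # permute to [100, 32, 8] = [seq_len, batch, features] for the LSTM
    x = x.permute(1, 0, 2)
    out, _ = model(x)        # out: [100, 32, 16]
    pred = head(out[-1])     # last time step -> [32, 1]
    loss = loss_fn(pred, y)  # one update per 32 sequences
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The part I’m unsure about is whether stacking my [100, 1, 8] inputs and permuting like this is the intended way to get a [100, 32, 8] batch, or if there’s a cleaner approach.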
Maybe I’m misunderstanding something or I’m missing a simple solution.
Any help is greatly appreciated!