This seems like an easy problem, but I have not found a working solution yet.
I have a very large time-series dataset which, if put into a single dataloader, would far exceed the GPU's memory.
My general idea is a double for loop: the outer loop iterates over the DataFrame, takes a slice of it, and transforms that slice into a dataloader, which is then passed to an inner loop that runs through a set of epochs (with a third loop for the batches), training and validating one and the same model.
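To make the intended structure concrete, here is a minimal sketch of the chunked outer loop (the `make_dataloader` helper, `df`, and the training calls are placeholders for my setup, not real API calls):

```python
def iter_chunks(n_rows, chunk_size):
    """Yield (start, stop) row ranges that together cover the DataFrame."""
    for start in range(0, n_rows, chunk_size):
        yield start, min(start + chunk_size, n_rows)

# Outer loop: one chunk of the DataFrame at a time.
# for start, stop in iter_chunks(len(df), chunk_size=100_000):
#     chunk = df.iloc[start:stop]
#     loader = make_dataloader(chunk)       # hypothetical helper
#     for epoch in range(n_epochs):         # middle loop: epochs
#         for batch in loader:              # inner loop: batches
#             ...train/validate the same model...
```

The open question is how to fill in the training part of this sketch with the Lightning API.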
If this works, the next question is how to train the model. First I thought of calling trainer.fit() multiple times, but that does not seem to work. Then I thought about writing a custom optimization loop in the classical PyTorch style. However, the model I am using comes from PyTorch Forecasting and runs on PyTorch Lightning, so I need to use the Lightning API. How do I do that?
Can I solve this by writing a custom training_step(batch, batch_idx, dataloader_idx)? If so, how do I implement it correctly so that it runs over multiple dataloaders?
FYI: my DataFrame is so large that manually splitting it into dataloader_i does not work; the for loop over the DataFrame is mandatory.