When iterating over one huge data file, how do I load the next huge data file?

Hello, I have ten Python pickle files, 1.pkl ~ 10.pkl, each larger than 10 GB.
Since my RAM is limited, how can I:
Load 1.pkl into a dataset, finish training on it, then
Load 2.pkl into the dataset, continue training
…
Load 10.pkl into the dataset, continue training
End the epoch

How can I achieve this? Thank you

You could implement exactly the logic you’ve described:

  • create a custom Dataset that accepts a path to one pkl file
  • loop over all 10 paths
  • create a new Dataset from the current path
  • train your model inside the loop (see the sketch after this list)
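Here is a minimal sketch of that loop. The file names 1.pkl ~ 10.pkl come from your post; everything else is an assumption for illustration: the pickle layout (a list of (feature tensor, label) pairs), the placeholder model, loss, optimizer, and batch size. Replace those with your own.

```python
import pickle

import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader


class PickleDataset(Dataset):
    """Wraps a single .pkl file; only one file is held in RAM at a time."""

    def __init__(self, path):
        with open(path, "rb") as f:
            # Assumed layout: a list of (feature_tensor, label) pairs.
            self.data = pickle.load(f)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]


# Placeholder model/criterion/optimizer; swap in your own.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(128, 10).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

paths = [f"{i}.pkl" for i in range(1, 11)]
for epoch in range(3):
    for path in paths:
        dataset = PickleDataset(path)  # load the current ~10 GB file
        loader = DataLoader(dataset, batch_size=64, shuffle=True)
        model.train()
        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()
            optimizer.step()
        del dataset, loader  # free RAM before loading the next file
```

The key point is that the Dataset is recreated per file inside the inner loop, so at most one pickle file is resident in memory at any time; the `del` at the end just makes the release explicit before the next `pickle.load`.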

OK, thank you very much, I will try it.