I have several CSV files with temporal data, and from each CSV file I want to get minibatches (e.g. of size 200) of sequential data. How can I incorporate all these CSV files in a PyTorch dataset without mixing information from one CSV into another?
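You could store the paths to all the .csv files in your __init__ method and load each file separately in __getitem__, using a modulo operation (or a range check on the index) to work out which file a given index belongs to.
E.g. if file1 contains 100 sliding windows (indices 0 to 99), you could check whether 100 <= index < 200 and, if so, load window index - 100 from the second file.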
Have a look at this small example for a sliding window approach.
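As a rough sketch of the index mapping described above (this is my own illustration, not the linked example): the class name MultiCSVWindowDataset, the stride and skiprows parameters, and the assumption that every CSV is purely numeric with one header row are all hypothetical. It keeps a running total of windows per file and uses bisect to find the file an index falls into, so a window never spans two files.

```python
import bisect

import numpy as np
import torch
from torch.utils.data import Dataset


class MultiCSVWindowDataset(Dataset):
    """Sliding windows over several CSV files; a window never crosses files.

    Assumption: each CSV holds only numeric columns and `skiprows` header lines.
    """

    def __init__(self, csv_paths, window_size=200, stride=1, skiprows=1):
        self.csv_paths = list(csv_paths)
        self.window_size = window_size
        self.stride = stride
        self.skiprows = skiprows
        # cumulative[i] = total number of windows in files 0..i
        self.cumulative = []
        total = 0
        for path in self.csv_paths:
            n_rows = self._load(path).shape[0]
            # Files shorter than one window contribute no samples.
            n_windows = max(0, (n_rows - window_size) // stride + 1)
            total += n_windows
            self.cumulative.append(total)

    def _load(self, path):
        # Reads the whole file; ndmin=2 keeps a (rows, columns) shape.
        return np.loadtxt(path, delimiter=",", skiprows=self.skiprows,
                          ndmin=2, dtype=np.float32)

    def __len__(self):
        return self.cumulative[-1] if self.cumulative else 0

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # First file whose cumulative window count exceeds the index.
        file_idx = bisect.bisect_right(self.cumulative, index)
        first = self.cumulative[file_idx - 1] if file_idx > 0 else 0
        # Position of the window inside that file.
        start = (index - first) * self.stride
        data = self._load(self.csv_paths[file_idx])
        return torch.from_numpy(data[start:start + self.window_size])
```

Wrapping this in DataLoader(dataset, batch_size=..., shuffle=True) would then give batches in which each sample is one window of 200 consecutive rows from a single file. Note that this sketch re-reads the file on every __getitem__ call, which is simple but slow; if the data fits in memory, loading the arrays once in __init__ is probably the better choice.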