How to best process/load time series data for regression?

Ahhh okay, I see what you mean. So, calculate how many chunks I want the dataset to be able to return, then do some index wrangling in __getitem__() to make sure I’m taking the desired slice of the data. That sounds very doable.

At the moment I’m only working with a small subset of my dataset, just to get everything up and running. But the entire dataset is stored in a single HDF5 dataset/array, so indexing should be fairly straightforward. The actual dataset consists of 64 ten-minute recordings.
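For reference, here is a rough sketch of the index wrangling I have in mind. It is only an illustration under assumptions: the class name ChunkedRecordings, the (n_recordings, n_samples) layout, the non-overlapping chunks, and the toy array are all made up for the example, and the regression targets are left out because they depend on the task.

```python
import torch
from torch.utils.data import Dataset


class ChunkedRecordings(Dataset):
    """Serve fixed-length windows from an array shaped (n_recordings, n_samples).

    Hypothetical helper: the name and the array layout are assumptions.
    """

    def __init__(self, data, chunk_len):
        self.data = data
        self.chunk_len = chunk_len
        n_recordings, n_samples = data.shape
        # Non-overlapping chunks; a trailing partial chunk is dropped.
        self.chunks_per_rec = n_samples // chunk_len
        self.n_chunks = n_recordings * self.chunks_per_rec

    def __len__(self):
        # Total number of chunks the dataset can return.
        return self.n_chunks

    def __getitem__(self, idx):
        if not 0 <= idx < self.n_chunks:
            raise IndexError(idx)
        # Map the flat index to (recording, chunk within that recording).
        rec, chunk = divmod(idx, self.chunks_per_rec)
        start = chunk * self.chunk_len
        window = self.data[rec, start:start + self.chunk_len]
        return torch.as_tensor(window, dtype=torch.float32)


# Toy stand-in for the real array: 4 recordings of 10 samples each.
data = torch.arange(40, dtype=torch.float32).reshape(4, 10)
ds = ChunkedRecordings(data, chunk_len=3)
```

Since the class only uses data.shape and data[rec, start:stop], I would expect an h5py dataset to slot in where the toy tensor is, because it supports the same slicing and returns NumPy arrays. I have not tested that with multiple DataLoader workers, where the file may need to be opened inside each worker.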

I’m still pretty new to PyTorch, so I’m still getting to grips with how customizable things like Datasets are. Thanks very much!