Pandas DataFrame.groupby in Dataset/Data loader

Hello all,

I am thinking about creating a general data loader for 3D imaging inputs (volumes). Eventually I want a lot of flexibility in how I load the images (for example, what percentage of each volume is fed in), and I want to control sampling (and break down metrics) by certain attributes, which could easily be stored in a DataFrame.

I was considering using the object returned by pandas.DataFrame.groupby as the core of the data loader. It would be relatively simple to initialize the core parameters that define the image coordinates of interest.
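
To make the idea concrete, here is a rough sketch of what I have in mind, not a finished design. The class name `GroupedPatchIndex`, the column names (`volume_id`, `label`, `z`, `y`, `x`), and the `sample_balanced` helper are all placeholders I made up for illustration. It is a plain class with `__len__` and `__getitem__`, which I assume could later be wrapped in a framework-specific Dataset.

```python
import numpy as np
import pandas as pd


class GroupedPatchIndex:
    """Map-style index over patches, backed by a DataFrame.groupby result.

    Hypothetical sketch: one row per patch, with the patch origin stored
    in the (assumed) columns z, y, x.
    """

    def __init__(self, table, by, patch_size):
        self.table = table.reset_index(drop=True)
        self.patch_size = tuple(patch_size)
        # GroupBy.indices maps each group key to the row positions in that group.
        self.groups = self.table.groupby(by).indices
        self.keys = list(self.groups)

    def __len__(self):
        return len(self.table)

    def __getitem__(self, i):
        row = self.table.iloc[i]
        start = (int(row["z"]), int(row["y"]), int(row["x"]))
        slices = tuple(slice(s, s + p) for s, p in zip(start, self.patch_size))
        return row["volume_id"], slices

    def sample_balanced(self, n, rng):
        """Draw n row positions: pick a group uniformly, then a row within it."""
        picks = []
        for _ in range(n):
            key = self.keys[rng.integers(len(self.keys))]
            picks.append(int(rng.choice(self.groups[key])))
        return picks


# Toy metadata table and in-memory volumes, purely for illustration.
table = pd.DataFrame({
    "volume_id": ["a", "a", "a", "b"],
    "label": ["lesion", "healthy", "healthy", "lesion"],
    "z": [0, 8, 16, 0],
    "y": [0, 0, 8, 4],
    "x": [0, 8, 0, 4],
})
volumes = {"a": np.zeros((32, 32, 32)), "b": np.ones((32, 32, 32))}

index = GroupedPatchIndex(table, by="label", patch_size=(8, 8, 8))
volume_id, slices = index[3]
patch = volumes[volume_id][slices]  # an 8x8x8 crop of volume "b"
picks = index.sample_balanced(10, np.random.default_rng(0))
```

The point of grouping first is that sampling per attribute (here the made-up `label` column) becomes a dictionary lookup, while the same DataFrame can still be used to aggregate metrics per group afterwards.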

Is this a good strategy? Alternatively, I could have a custom data structure that maps across the image volumes and the spatial coordinates.
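
For comparison, the custom-structure alternative might look roughly like the following. Again this is only a sketch under my own assumptions: `PatchRef` and `grid_patches` are hypothetical names, and the patches are laid out on a simple non-overlapping grid.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class PatchRef:
    """Hypothetical record tying one patch to a volume and a spatial origin."""
    volume_id: str
    start: tuple  # (z, y, x) origin of the patch
    size: tuple   # (depth, height, width) of the patch

    def slices(self):
        return tuple(slice(s, s + n) for s, n in zip(self.start, self.size))


def grid_patches(volume_id, shape, size):
    """List non-overlapping patches that fit entirely inside the volume."""
    starts = [range(0, dim - n + 1, n) for dim, n in zip(shape, size)]
    return [
        PatchRef(volume_id, (z, y, x), tuple(size))
        for z in starts[0]
        for y in starts[1]
        for x in starts[2]
    ]


refs = grid_patches("a", shape=(32, 32, 16), size=(16, 16, 16))
crop = np.zeros((32, 32, 16))[refs[-1].slices()]
```

This keeps the coordinates explicit and typed, but any attribute-based sampling or per-group metrics would have to be written by hand instead of coming from the DataFrame.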

Any feedback is appreciated before I dive into this :slight_smile: