Hello guys,
I’m wondering if there’s a standardized/best-practice way of both storing and loading data for a multi-label dataset. For my use case I’m generating data from an audio dataset: I cut audio tracks into spectrograms as input features, and each input is multi-labeled with 18 different classes.
After preprocessing one audio track I get, for example, a (33 x 96 x 86) numpy array of 33 inputs and a corresponding (33 x 18) numpy array of labels.
Right now I’m storing everything like:
└── Preprocessed
    ├── Track1
    │   ├── labels.npy
    │   └── spectrograms.npy
    ├── Track2
    │   ├── labels.npy
    │   └── spectrograms.npy
    └── Track3
        ├── labels.npy
        └── spectrograms.npy
The reason for this is so that I can keep track of which track each input comes from, in case I want to balance the train/test split somehow.
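To make that concrete, this is roughly how I load the structure today, keeping a per-input track index so I can still group by track later (paths and function names are just illustrative):

```python
import numpy as np
from pathlib import Path

def load_all(root="Preprocessed"):
    """Load every track and remember which track each input came from."""
    specs, labels, track_ids = [], [], []
    for i, track_dir in enumerate(sorted(Path(root).iterdir())):
        s = np.load(track_dir / "spectrograms.npy")  # (N_i, 96, 86)
        y = np.load(track_dir / "labels.npy")        # (N_i, 18)
        specs.append(s)
        labels.append(y)
        track_ids.append(np.full(len(s), i))         # track index per input
    return (np.concatenate(specs),
            np.concatenate(labels),
            np.concatenate(track_ids))
```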
However, it feels like kind of a hassle to build a Dataset and corresponding __getitem__ in PyTorch using this structure, so I’m wondering if there is a smarter, standardized way to structure and load multi-labeled data?
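For reference, the straightforward Dataset I have in mind looks roughly like this (the class name and the pre-concatenated arrays are just for illustration):

```python
import numpy as np
import torch
from torch.utils.data import Dataset

class SpectrogramDataset(Dataset):
    """Multi-label dataset over pre-concatenated spectrogram/label arrays."""

    def __init__(self, spectrograms, labels):
        assert len(spectrograms) == len(labels)
        self.spectrograms = spectrograms  # (N, 96, 86) float array
        self.labels = labels              # (N, 18) multi-hot label vectors

    def __len__(self):
        return len(self.spectrograms)

    def __getitem__(self, idx):
        x = torch.from_numpy(self.spectrograms[idx]).float()
        # Float multi-hot target, as expected by e.g. BCEWithLogitsLoss
        y = torch.from_numpy(self.labels[idx]).float()
        return x, y
```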