I have a dataset where each sample, after preprocessing, is a 365 × 3000 pandas/numpy array, and the arrays are sparse. The target for each sample is a binary label. I have around 500K of these samples, so obviously I can't load the full 500K × 365 × 3000 data into memory. My plan is to save each numpy array to disk (like an image) and load them back in batches during training, which takes care of the memory side. However, I suspect the per-batch I/O will waste a lot of time. Is there a better solution out there? And if I do have to save the files, would torch.sparse help?
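
For reference, this is roughly the setup I have in mind — just a minimal sketch, assuming each sample has already been saved as a `scipy.sparse` matrix in its own `.npz` file (the file paths, label array, and batch settings are placeholders, not my actual pipeline):

```python
import scipy.sparse as sp
import torch
from torch.utils.data import Dataset, DataLoader

class SparseFileDataset(Dataset):
    """Lazily loads one 365 x 3000 sparse sample per file at __getitem__ time."""

    def __init__(self, file_paths, labels):
        self.file_paths = file_paths   # list of .npz paths, one per sample
        self.labels = labels           # binary targets, same order as file_paths

    def __len__(self):
        return len(self.file_paths)

    def __getitem__(self, idx):
        # scipy.sparse.load_npz reads a matrix previously saved with sp.save_npz
        mat = sp.load_npz(self.file_paths[idx])
        # densify one sample at a time (365 x 3000 float32 is only ~4 MB),
        # so only the current batch ever sits in memory
        x = torch.from_numpy(mat.toarray()).float()
        y = torch.tensor(self.labels[idx], dtype=torch.float32)
        return x, y

# Saving would happen once during preprocessing, e.g.:
#     sp.save_npz(f"sample_{i}.npz", sp.csr_matrix(arr))
# Then during training:
#     loader = DataLoader(SparseFileDataset(paths, labels), batch_size=32,
#                         num_workers=4, pin_memory=True)
```

My hope is that `num_workers` would at least overlap the disk reads with training, but I'm not sure whether this file-per-sample approach is the right call, or whether keeping the tensors sparse end to end (via torch.sparse) would be better than densifying in `__getitem__`.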