I have ~0.5 million files from which I extract features as torch.Tensors with dimensions of approximately 100x100x100. Precomputing the features for all files doesn't seem like an option because I don't have enough storage space. On the other hand, computing the features takes ~10 s per file, so I'm worried this will become a bottleneck during training if I don't precompute them.
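For reference, this is roughly what my current on-the-fly setup looks like: the features are computed inside `__getitem__`, so `DataLoader` workers can overlap extraction with training. `extract_features` here is just a stand-in for my real pipeline (it returns a small dummy tensor so the snippet runs):

```python
import torch
from torch.utils.data import Dataset, DataLoader


def extract_features(path):
    # Stand-in for the real ~10 s feature extraction per file;
    # here it just returns a small random tensor.
    return torch.rand(4, 4, 4)


class OnTheFlyDataset(Dataset):
    """Computes features at load time instead of precomputing them."""

    def __init__(self, file_paths):
        self.file_paths = file_paths

    def __len__(self):
        return len(self.file_paths)

    def __getitem__(self, idx):
        # Expensive step happens here, once per sample per epoch.
        return extract_features(self.file_paths[idx])


paths = [f"file_{i}.bin" for i in range(8)]  # hypothetical file list
# In practice I'd set num_workers > 0 to parallelize the extraction.
loader = DataLoader(OnTheFlyDataset(paths), batch_size=2, num_workers=0)
batch = next(iter(loader))
print(batch.shape)  # torch.Size([2, 4, 4, 4])
```

With the real extraction cost, even many workers may not keep the GPU fed, which is why I'm asking about alternatives.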
I would be grateful if anyone could suggest the standard/best practice in this case, or point me to a relevant source.
Thank you!