Why use torch dataset?

I understand that data.Dataset can be useful if we need to prepare our data (e.g., load it from disk, or do data augmentation). Is there any benefit to using Dataset if all of our data is already in memory and we are not doing augmentation?

We can use TensorDataset for in-memory data stored in tensors. That dataset is then fed to DataLoader, which produces shuffled batches.
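A minimal sketch of this pattern, assuming the in-memory data is a pair of tensors (the feature and label shapes here are just illustrative):

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Some in-memory data: 100 samples with 3 features each, plus binary labels.
features = torch.randn(100, 3)
labels = torch.randint(0, 2, (100,))

# TensorDataset wraps the tensors; indexing it returns one (feature, label) pair.
dataset = TensorDataset(features, labels)

# DataLoader handles shuffling and batching on top of the dataset.
loader = DataLoader(dataset, batch_size=16, shuffle=True)

for x, y in loader:
    # x has shape (16, 3) for full batches; the final batch may be smaller.
    pass
```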

So it’s just so I don’t have to do my own shuffling?

Doing your own shuffling is not forbidden in PyTorch.

Essentially, yes: the point is that you don’t have to bother yourself with shuffling, efficient batching (for various types of inputs), or pinning memory.
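Those conveniences are all options on DataLoader itself. A small sketch (the dataset here is a stand-in; `pin_memory` is only enabled when a GPU is available, since page-locked memory mainly speeds up host-to-GPU transfers):

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

dataset = TensorDataset(torch.randn(8, 4))

loader = DataLoader(
    dataset,
    batch_size=4,
    shuffle=True,                           # reshuffled every epoch
    pin_memory=torch.cuda.is_available(),   # faster CPU-to-GPU copies
)

for (x,) in loader:
    # Each x is a batch of shape (4, 4), assembled by the default collate function.
    pass
```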

As @Tony-Y stated, you can always introduce your own methods, but you don’t have to.