Save dataset into .pt file

I am taking MNIST data and performing some processing on it.
Instead of repeating this processing every time an image is loaded, I want to save the result as a new dataset so that I can read it directly the next time.

What is the proper way of saving a dataset? I can’t seem to find any examples.

Your approach of storing the data as tensors seems like a good idea!
If you save the data as tensors and load them in your next run, you could pass them into a TensorDataset.
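To make the suggestion above concrete, here is a minimal sketch, assuming the processed data fits in memory as one image tensor and one label tensor. The random tensors stand in for the processed MNIST data, and the file name `processed_mnist.pt` and the dict keys are just illustrative choices, not anything prescribed by PyTorch.

```python
import os
import tempfile

import torch
from torch.utils.data import TensorDataset

# Stand-in for the processed MNIST data (assumption: 100 images of shape 1x28x28).
images = torch.rand(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))

# Hypothetical file name; any writable path works.
path = os.path.join(tempfile.gettempdir(), "processed_mnist.pt")

# Save both tensors in a single file as a dict.
torch.save({"images": images, "labels": labels}, path)

# In the next run: load the tensors and wrap them in a TensorDataset.
saved = torch.load(path)
dataset = TensorDataset(saved["images"], saved["labels"])
```

Indexing `dataset[i]` then returns an `(image, label)` tuple, so the expensive processing only has to run once before the `torch.save` call.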

How about using the pickle library?

https://docs.python.org/3/library/pickle.html
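For comparison, a pickle-based version might look like the sketch below. It assumes a small in-memory `TensorDataset` built from placeholder tensors, and the file name `processed_mnist.pkl` is hypothetical. Note that `torch.save` itself uses pickle by default, so this is mostly a matter of preference.

```python
import os
import pickle
import tempfile

import torch
from torch.utils.data import TensorDataset

# Placeholder tensors standing in for the processed data (assumption).
dataset = TensorDataset(torch.rand(10, 1, 28, 28), torch.randint(0, 10, (10,)))

# Hypothetical file name.
path = os.path.join(tempfile.gettempdir(), "processed_mnist.pkl")

# Serialize the whole dataset object.
with open(path, "wb") as f:
    pickle.dump(dataset, f)

# Later: restore it. Only unpickle files you trust.
with open(path, "rb") as f:
    restored = pickle.load(f)
```

Pickling the dataset object ties the file to the class definition that was importable at save time, which is one reason saving plain tensors tends to be more robust.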

Thanks a lot. Could you share an example?
What should be done after calling:
torch.utils.data.TensorDataset(data)
Out[14]: <torch.utils.data.dataset.TensorDataset at 0x1d6c4522ef0>

Once you’ve loaded the tensors and created a TensorDataset, you could pass it to a DataLoader and start the training. Have a look at the Data loading tutorial for more information.
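As a rough illustration of that step, the sketch below wraps placeholder tensors in a `TensorDataset` and iterates over it with a `DataLoader`. The tensor shapes and `batch_size=32` are assumptions for the example; the loop body is where a training step would go.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder tensors; in practice these would be the loaded, processed data.
images = torch.rand(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))
dataset = TensorDataset(images, labels)

# The DataLoader handles batching and shuffling.
loader = DataLoader(dataset, batch_size=32, shuffle=True)

batch_shapes = []
for batch_images, batch_labels in loader:
    # A training step (forward pass, loss, backward pass) would go here.
    batch_shapes.append((tuple(batch_images.shape), tuple(batch_labels.shape)))
```

If the dataset is built from a single tensor, as in `TensorDataset(data)`, each sample is a one-element tuple, so the loop would unpack it as `for (batch_data,) in loader:`.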

Thanks, but how can I use it in a separate session?

You could just load the tensors again and create the Dataset in the same way in another script.