Save dataset into .pt file

himat · September 16, 2018, 4:53pm

I am taking MNIST data and performing some processing on it.
Instead of doing this processing every time the image is loaded, I want to just save it as a new dataset so that I can just directly read it the next time.

What is the proper way of saving a dataset? I can’t seem to find any examples.

ptrblck · September 16, 2018, 5:05pm

Your approach of storing the data as tensors seems like a good idea!
If you save the data as tensors and load them in your next run, you could pass them into a TensorDataset.
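
A minimal sketch of that save-and-reload idea, assuming the processed data fits in memory as two tensors. The random tensors stand in for MNIST, and the normalization step, file name, and dict keys are illustrative assumptions, not something from this thread:

```python
import os
import tempfile

import torch
from torch.utils.data import TensorDataset

# Stand-in for the raw MNIST images and labels (hypothetical shapes and values).
images = torch.rand(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))

# Hypothetical processing step, done once: normalize to zero mean, unit std.
processed = (images - images.mean()) / images.std()

# Save both tensors in one .pt file; the path and keys are illustrative.
path = os.path.join(tempfile.mkdtemp(), "processed_mnist.pt")
torch.save({"images": processed, "labels": labels}, path)

# Next run: load the tensors and wrap them in a TensorDataset.
loaded = torch.load(path)
dataset = TensorDataset(loaded["images"], loaded["labels"])
```

Indexing `dataset[i]` then returns an `(image, label)` tuple without redoing the processing.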

RobertYu · August 30, 2019, 2:36am

How about using the pickle library?

https://docs.python.org/3/library/pickle.html

Frida · October 11, 2019, 8:41am

Thanks a lot, can you share an example?
What should be done after calling:
torch.utils.data.TensorDataset(data)
Out[14]: <torch.utils.data.dataset.TensorDataset at 0x1d6c4522ef0>

ptrblck · October 11, 2019, 3:38pm

Once you’ve loaded the tensors and created a TensorDataset, you could pass it to a DataLoader and start the training. Have a look at the Data loading tutorial for more information.
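
As a rough sketch of that step, the snippet below wraps two tensors in a TensorDataset and iterates over it with a DataLoader. The random tensors, batch size, and variable names are assumptions for illustration, and the loop body only records batch shapes where a training step would go:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for tensors loaded from disk (hypothetical shapes and values).
images = torch.rand(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))

dataset = TensorDataset(images, labels)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

# Each iteration yields one batch of images and the matching labels.
batch_shapes = []
for batch_images, batch_labels in loader:
    # A real training step (forward, loss, backward) would go here.
    batch_shapes.append((tuple(batch_images.shape), tuple(batch_labels.shape)))
```

With 100 samples and a batch size of 32, the last batch holds the remaining 4 samples unless `drop_last=True` is passed.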

Frida · October 12, 2019, 7:30am

Thanks but how can I use it in a separate session?

ptrblck · October 12, 2019, 1:27pm

You could just load the tensors again and create the Dataset in the same way in another script.
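
One way to arrange this across sessions is a small caching helper: it loads the saved tensors if the file exists and otherwise builds and saves them. This is only a sketch. `build_tensors` and `get_dataset` are hypothetical names, and the random data stands in for the real processing:

```python
import os
import tempfile

import torch
from torch.utils.data import TensorDataset


def build_tensors():
    # Hypothetical stand-in for the expensive processing step.
    images = torch.rand(50, 1, 28, 28)
    labels = torch.randint(0, 10, (50,))
    return images, labels


def get_dataset(path):
    # Reuse the saved tensors when the file exists; otherwise build and save them.
    if os.path.exists(path):
        saved = torch.load(path)
        images, labels = saved["images"], saved["labels"]
    else:
        images, labels = build_tensors()
        torch.save({"images": images, "labels": labels}, path)
    return TensorDataset(images, labels)


# Illustrative location; a real script would use a fixed path shared by all runs.
path = os.path.join(tempfile.mkdtemp(), "processed.pt")
first = get_dataset(path)   # no file yet: builds the tensors and saves them
second = get_dataset(path)  # file exists: loads from disk, as a later session would
```

Any script that calls `get_dataset` with the same path gets the same data, so the processing only runs once.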