Create custom dataset from tensors

Hello everyone. I am new to PyTorch. I have a program that produces tensors and their labels. I want to create a dataset (perhaps a .pt file) and use it for training. How should I do that, and how is each tensor matched to its label? Thanks a lot.

In case you have already created the data and target tensors, you could use torch.utils.data.TensorDataset to create the dataset.

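You, as the creator of the dataset, would have to make sure that each created data sample matches its target. TensorDataset pairs entries by their index along the first dimension, so data[i] has to correspond to target[i].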
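To illustrate the index-based pairing, here is a minimal sketch. The tensor values and the names data and targets are made up for the example; only TensorDataset itself comes from the reply above.

```python
import torch
from torch.utils.data import TensorDataset

# Hypothetical data: 4 samples of shape (2, 2) and one integer target per sample.
data = torch.arange(16, dtype=torch.float32).reshape(4, 2, 2)
targets = torch.tensor([0, 1, 0, 1])

# All tensors passed in must have the same size in the first dimension.
dataset = TensorDataset(data, targets)

# Indexing returns the pair (data[i], targets[i]).
x, y = dataset[2]
```

Because the pairing is purely positional, the data and target tensors have to be built in the same order.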

Thank you very much. I create two empty lists and, in a loop, append each data tensor to the first list and its target to the second, say features=[tensor_1, tensor_2, …] and targets=[label_of_tensor_1, label_of_tensor_2, …]. Then I call TensorDataset(features, targets). Is this the correct way? I get an error: "ValueError: only one element tensors can be converted to Python scalars". Can you help me further?

Yes, this is generally the right approach.
Could you check, if all the data and target tensors have the same shape, and thus a tensor creation via:

features = [tensor1, tensor2, ...]
features = torch.stack(features)

would work?
I guess the new error is raised because of unexpected shapes, but am unsure which operation raises it.
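One way to run that shape check before stacking is sketched below. The helper name stack_if_uniform is hypothetical and not part of PyTorch; it only wraps torch.stack with an explicit shape comparison so that a mismatch is reported clearly.

```python
import torch

def stack_if_uniform(tensors):
    """Stack a list of tensors, reporting the shapes found if they differ."""
    shapes = {tuple(t.shape) for t in tensors}
    if len(shapes) != 1:
        raise ValueError(f"cannot stack tensors with differing shapes: {sorted(shapes)}")
    return torch.stack(tensors)

# Three tensors of identical shape (2, 2) stack into one tensor of shape (3, 2, 2).
features = [torch.rand(2, 2), torch.rand(2, 2), torch.rand(2, 2)]
stacked = stack_if_uniform(features)
```

torch.stack adds a new leading dimension, which then serves as the sample dimension of the dataset.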

I'm very sorry; the error I quoted in my previous reply came from some additional code. The actual error is: "AttributeError: 'list' object has no attribute 'size'". Here is a simple example I wrote to reproduce it:
My code:
my_x = [torch.rand(2,2),torch.rand(2,2)] # list of tensors
my_y = [torch.rand(1), torch.rand(1)] # list of targets
my_dataset = TensorDataset(my_x, my_y)

I apologize if my questions are childish. Thanks a lot.

Ah OK, that would fit my expectation of the error and this would work:

my_x = torch.stack([torch.rand(2,2),torch.rand(2,2)])
my_y = torch.stack([torch.rand(1), torch.rand(1)])
my_dataset = torch.utils.data.TensorDataset(my_x,my_y)

The error is raised because TensorDataset expects tensors, while you are passing lists.
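Since the goal is training, the stacked dataset would typically be wrapped in a DataLoader next. That step is not part of the thread, so treat the following as a sketch; the sample count and batch size are arbitrary.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Six hypothetical samples, built the same way as in the snippet above.
my_x = torch.stack([torch.rand(2, 2) for _ in range(6)])  # shape (6, 2, 2)
my_y = torch.stack([torch.rand(1) for _ in range(6)])     # shape (6, 1)
my_dataset = TensorDataset(my_x, my_y)

# Batches keep data and targets aligned; the last batch holds the remainder.
loader = DataLoader(my_dataset, batch_size=4, shuffle=False)
batch_shapes = [(xb.shape, yb.shape) for xb, yb in loader]
```

With shuffle=True the order changes every epoch, but each sample still arrives together with its own target.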


Thank you very much, it works. One final question: I would like to save my_x and my_y for later use (or perhaps save the lists of tensors and targets that I created). How can I save them and load them afterwards? If I do not save them, I will have to rebuild two large lists with append every time, which takes a long time.

You can save and load tensors via torch.save(tensor, path) and torch.load(path), respectively. This would allow you to load these tensors afterwards and create the TensorDataset directly without creating the lists.
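A minimal round trip might look like the sketch below. Note that torch.load takes only the path (or a file-like object). The file name dataset.pt, the dictionary keys, and the temporary directory are assumptions for the example; saving both tensors in a single dict is one option, and two separate files would work as well.

```python
import os
import tempfile

import torch
from torch.utils.data import TensorDataset

my_x = torch.stack([torch.rand(2, 2), torch.rand(2, 2)])
my_y = torch.stack([torch.rand(1), torch.rand(1)])

# A temporary directory keeps the example self-contained; use a real path in practice.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "dataset.pt")
    # Store both tensors in one file so data and targets stay together.
    torch.save({"x": my_x, "y": my_y}, path)
    loaded = torch.load(path)

# Rebuild the dataset directly from the loaded tensors, without any lists.
restored = TensorDataset(loaded["x"], loaded["y"])
```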
