Split dataset (always same images)

Hi.

I would like to split my dataset into training and test sets, but I don’t want the split to be random, as with:

torch.utils.data.random_split(dataset, [train_size, val_size])

This is because I want to run several trainings under the same conditions, with the same test images in every run.

Is there a PyTorch function that does this?

Thank you very much!

You can wrap your dataset into a Subset and pass the corresponding indices to it to create the training and validation splits.

Sorry, but how can I put this into code?

Here is an example:

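import torch
from torch.utils.data import TensorDataset

# Toy dataset with 10 samples (the values 0..9)
dataset = TensorDataset(torch.arange(10))

# Fixed indices, so both splits contain the same samples on every run
train_dataset = torch.utils.data.Subset(dataset, indices=torch.arange(5))
val_dataset = torch.utils.data.Subset(dataset, indices=torch.arange(5, 10))

for d in train_dataset:
    print(d)

for d in val_dataset:
    print(d)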
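
If you still want a shuffled split that stays the same across trainings, one option is to pass a seeded generator to random_split. This is only a minimal sketch: the toy TensorDataset, the 8/2 split sizes, the seed value and the make_split helper are assumptions for illustration, not part of the original question.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Toy dataset standing in for the real image dataset (assumption)
dataset = TensorDataset(torch.arange(10))

def make_split(seed=0):
    # Hypothetical helper: a generator seeded with a fixed value
    # makes random_split pick the same indices on every call
    generator = torch.Generator().manual_seed(seed)
    return random_split(dataset, [8, 2], generator=generator)

train_a, val_a = make_split()
train_b, val_b = make_split()

# Compare the indices chosen by the two calls
print(val_a.indices == val_b.indices)
```

The split only stays the same as long as the dataset keeps the same length and sample order. If you need the split to survive library upgrades or dataset changes, it is probably safer to save the indices to a file and pass them to Subset as in the example above.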

Perfect! Thank you very much!