Split dataset into two new datasets (NOT subset)

Hi all,

Is it possible to split a torch dataset into two new datasets (train and validation) whose indices run from 0 to the train length and from 0 to the validation length, respectively? As far as I can tell, the random_split method returns indices in the range of 0 to the length of the whole dataset for both datasets.

Thank you for your help!

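random_split returns two Datasets (Subset objects) with the lengths specified in the lengths argument. Each one is indexed from 0 to its own length minus 1, not with the indices of the original dataset: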

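import torch
from torch.utils.data import TensorDataset, random_split

dataset = TensorDataset(torch.randn(1000, 1))
train_data, val_data = random_split(dataset, [900, 100])
print(len(train_data))
> 900
print(len(val_data))
> 100

# valid indices for train_data are 0..899
train_data[0] # works
train_data[899] # works
# train_data[900] # error (IndexError)

# valid indices for val_data are 0..99
val_data[0] # works
val_data[99] # works
# val_data[100] # error (IndexError)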
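
If you really need two standalone datasets rather than Subset views, here is a minimal sketch of one way to do it. It assumes the parent is a TensorDataset (so indexing it with a list of positions returns already-sliced tensors) and relies on the indices attribute of the returned Subset objects. The names train_ds and val_ds are just illustrative.

```python
import torch
from torch.utils.data import TensorDataset, random_split

torch.manual_seed(0)  # only to make the split reproducible
dataset = TensorDataset(torch.randn(1000, 1))
train_data, val_data = random_split(dataset, [900, 100])

# Each Subset stores positions into the parent dataset in .indices,
# but is itself indexed locally from 0 to len(subset) - 1.
parent_idx = train_data.indices[0]
assert torch.equal(train_data[0][0], dataset[parent_idx][0])

# Assumption: the parent is a TensorDataset, so dataset[list_of_positions]
# returns a tuple of tensors that can be wrapped in a new TensorDataset.
train_ds = TensorDataset(*dataset[train_data.indices])
val_ds = TensorDataset(*dataset[val_data.indices])

print(len(train_ds), len(val_ds))
```

The new datasets hold their own copies of the data, so this costs extra memory. For most training loops, passing the Subset objects straight to a DataLoader should be enough.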

Also answered here.