You wouldn’t need to create a dataframe and split it, but could use train_test_split
on the indices directly.
Once you have the training, validation, and test indices, you could then create Subsets from the Dataset with the corresponding indices.
I think this is the cleanest approach, as it wouldn’t try to reimplement already working methods such as sklearn's train_test_split.
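
Here is a rough sketch of what I mean. The toy TensorDataset, the 60/20/20 split, and the fixed random_state are just placeholders I picked for illustration, so swap in your own Dataset and ratios:

```python
import torch
from sklearn.model_selection import train_test_split
from torch.utils.data import Subset, TensorDataset

# Hypothetical toy dataset standing in for your real Dataset:
# 100 samples with 8 features each and a binary label.
dataset = TensorDataset(torch.randn(100, 8), torch.randint(0, 2, (100,)))

# Split the indices, not the data itself.
indices = list(range(len(dataset)))

# Hold out 20% of the indices for testing, then take 25% of the remainder
# for validation (0.25 * 80 = 20), leaving 60 indices for training.
train_val_idx, test_idx = train_test_split(indices, test_size=0.2, random_state=0)
train_idx, val_idx = train_test_split(train_val_idx, test_size=0.25, random_state=0)

# Wrap the original dataset with each index list.
train_set = Subset(dataset, train_idx)
val_set = Subset(dataset, val_idx)
test_set = Subset(dataset, test_idx)
```

Each Subset can then be passed to its own DataLoader. If you need the class balance preserved in each split, train_test_split also accepts a stratify argument, to which you would pass the labels for the indices being split.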