Not working for Subset

For text data, the loader works well with a Dataset. However, the function used to split the dataset into training and validation sets returns a Subset, not a Dataset.

This causes a problem: if we split a dataset into training and validation sets this way, how can we still generate train_loader and validate_loader from the resulting subsets?

Thank you very much.

AttributeError: 'Subset' object has no attribute 'sort_key'

The error above appears when the loader is applied to a Subset. Since a Subset is not a Dataset, how can we easily generate a loader from a Subset? Thank you very much.
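For anyone trying to see why the error appears, here is a minimal sketch in plain Python. ToyDataset and ToySubset are hypothetical stand-ins (not the real library classes), written under the assumption that a subset wrapper stores only its parent dataset and a list of indices and does not forward other attributes such as sort_key.

```python
class ToyDataset:
    """Hypothetical stand-in for a text dataset that carries a sort_key."""

    def __init__(self, examples, sort_key):
        self.examples = list(examples)
        self.sort_key = sort_key

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, i):
        return self.examples[i]


class ToySubset:
    """Hypothetical stand-in for an index-based subset wrapper.

    It keeps a reference to the parent dataset and the selected indices,
    and nothing else, so attributes of the parent are not visible on it.
    """

    def __init__(self, dataset, indices):
        self.dataset = dataset
        self.indices = list(indices)

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, i):
        return self.dataset[self.indices[i]]


full = ToyDataset(["a b c", "d", "e f"], sort_key=len)
train = ToySubset(full, [0, 2])

# Anything that expects dataset.sort_key fails on the wrapper.
try:
    train.sort_key
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# The key is still reachable through the wrapped parent dataset.
parent_key = train.dataset.sort_key
```

So in this sketch the data is all still there, but code that reads sort_key directly from the object it is handed will not find it on the wrapper.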

Found a solution: the dataset class itself has a method called split, which splits the dataset by ratio. This solves the problem of splitting a dataset into training and validation sets.

The split method of the dataset class returns datasets, NOT subsets, unlike the splitting function mentioned above.
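To illustrate the difference, here is a rough sketch of a ratio-based split that returns objects of the same dataset class, so sort_key survives the split. ToyDataset, its split signature, and the seed parameter are all hypothetical and only mimic the idea; this is not the library's actual implementation.

```python
import random


class ToyDataset:
    """Hypothetical stand-in for a text dataset that carries a sort_key."""

    def __init__(self, examples, sort_key):
        self.examples = list(examples)
        self.sort_key = sort_key

    def __len__(self):
        return len(self.examples)

    def split(self, split_ratio=0.7, seed=0):
        """Shuffle the examples and cut them by ratio.

        Both halves are new ToyDataset objects, so they keep sort_key.
        """
        indices = list(range(len(self.examples)))
        random.Random(seed).shuffle(indices)
        cut = int(round(split_ratio * len(indices)))

        def build(selected):
            return ToyDataset([self.examples[i] for i in selected], self.sort_key)

        return build(indices[:cut]), build(indices[cut:])


sentences = ["sentence number %d" % i for i in range(10)]
full = ToyDataset(sentences, sort_key=len)

train, valid = full.split(split_ratio=0.8)
```

Because train and valid are datasets in their own right, anything that reads sort_key from them keeps working, which matches the behaviour described above.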

However, this raises a question: what is Subset for, and when should it be used?

Hope this solution helps people with similar problems. Thanks.

Thanks @awsgcp. Indeed, I think torchtext has some duplicate code, which should be retired. See an issue post here where we discuss a new, more compatible abstraction. In that case, we won't need to maintain those duplicate functions anymore.