Is there a PyTorch-native alternative to sklearn's KFold()?

Many people use the following code to do a k-fold split of the dataset:

from sklearn.model_selection import KFold

kfold = KFold(n_splits=k_folds, shuffle=True)
for fold, (train_ids, test_ids) in enumerate(kfold.split(dataset)):
    # build per-fold loaders / run training here (see the sketch below)
    ...
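Inside that loop, the split indices are then typically turned into per-fold DataLoaders, e.g. with SubsetRandomSampler (just a sketch; batch_size is a placeholder here):

from torch.utils.data import DataLoader, SubsetRandomSampler

# each loader only draws the samples whose indices belong to the current fold
train_loader = DataLoader(dataset, batch_size=batch_size,
                          sampler=SubsetRandomSampler(train_ids))
test_loader = DataLoader(dataset, batch_size=batch_size,
                         sampler=SubsetRandomSampler(test_ids))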

I just want to know whether there are PyTorch-native ways to do this. Thanks. I'd rather not import too many extra packages.

A follow-up question:
If I want to do some classical machine learning instead of deep learning, do I need to learn sklearn? Does PyTorch include those modules? Thanks.


I'm not aware of a native PyTorch implementation of KFold and would generally recommend using implemented and well-tested modules (in this case from sklearn) instead of reimplementing the same functionality (and potentially hitting bugs), unless you have a strong reason to do so.
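That being said, if the goal is only to avoid the extra dependency, a shuffled k-fold split can be pieced together from torch.randperm, torch.tensor_split, and torch.utils.data.Subset. A rough sketch (the kfold_indices helper is made up here, and dataset / k_folds are the placeholders from your snippet), without the input validation sklearn does for you:

import torch
from torch.utils.data import Subset

def kfold_indices(n_samples, k_folds):
    # shuffle all indices once, then split them into k roughly equal folds
    indices = torch.randperm(n_samples)
    return [fold.tolist() for fold in torch.tensor_split(indices, k_folds)]

folds = kfold_indices(len(dataset), k_folds)
for fold_idx, test_ids in enumerate(folds):
    # every index outside the current test fold becomes a train index
    train_ids = [i for j, fold in enumerate(folds) if j != fold_idx for i in fold]
    train_set, test_set = Subset(dataset, train_ids), Subset(dataset, test_ids)

Whether that is worth maintaining over sklearn's well-tested KFold (which also gives you stratified and grouped variants) is exactly the trade-off mentioned above.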

It depends on which models and use cases you would like to work on, but sklearn is certainly a good library to be familiar with.

Some of them should be available (e.g. equivalents of sklearn's neural network classes), but other modules (e.g. the random forest / decision tree classes) might not be available natively (you might find some PyTorch ports on GitHub, but I haven't checked).

Thanks, ptrblck! :grinning: