What is the best way to apply k-fold cross validation in CNN?


How can I apply k-fold cross-validation with a CNN? I do not want to do it manually; for example, in leave-one-out I would remove one item from the training set, train the network, and then test on the removed item. Could you please help me do this in a standard way?



Have a look at Skorch. It’s a scikit-learn compatible wrapper for PyTorch.
scikit itself offers a lot of cross-validation methods. :wink:


Thank you, this helped me a lot. Another question: how can I reset the network between folds to avoid data leakage?

You could re-initialize the weights of the model.

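def weights_init(m):
    if isinstance(m, nn.Conv2d):
        nn.init.xavier_uniform_(m.weight)  # e.g. Xavier; any initializer from torch.nn.init works

model.apply(weights_init)  # calls weights_init on every submodule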

I haven’t used skorch yet, but the model reset should be implemented somewhere.

It seems a call to .fit re-initializes the model when warm_start is set to false. The model reset should therefore be performed automatically in grid searches.
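
If you write the fold loop by hand instead of relying on skorch, one way to get the same clean start is to build a fresh model and optimizer inside the loop. A minimal sketch, assuming a toy CNN (the hypothetical `make_model`), random data, and `KFold` from scikit-learn, none of which come from this thread:

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.model_selection import KFold
from torch.utils.data import DataLoader, Subset, TensorDataset


def make_model():
    # Hypothetical tiny CNN for 1x8x8 inputs and 2 classes.
    return nn.Sequential(
        nn.Conv2d(1, 4, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.Flatten(),
        nn.Linear(4 * 8 * 8, 2),
    )


def run_kfold(dataset, n_splits=5, epochs=2):
    fold_accuracies = []
    kfold = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    for train_idx, valid_idx in kfold.split(np.arange(len(dataset))):
        # New model and optimizer per fold, so nothing learned earlier carries over.
        model = make_model()
        optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
        criterion = nn.CrossEntropyLoss()
        train_loader = DataLoader(Subset(dataset, train_idx.tolist()), batch_size=16, shuffle=True)
        valid_loader = DataLoader(Subset(dataset, valid_idx.tolist()), batch_size=16)

        model.train()
        for _ in range(epochs):
            for xb, yb in train_loader:
                optimizer.zero_grad()
                criterion(model(xb), yb).backward()
                optimizer.step()

        # Score the fold on the samples that were held out from training.
        model.eval()
        correct = 0
        with torch.no_grad():
            for xb, yb in valid_loader:
                correct += (model(xb).argmax(dim=1) == yb).sum().item()
        fold_accuracies.append(correct / len(valid_idx))
    return fold_accuracies


torch.manual_seed(0)
data = TensorDataset(torch.randn(50, 1, 8, 8), torch.randint(0, 2, (50,)))
accuracies = run_kfold(data)
print(accuracies)
```

Re-creating the model is usually simpler than re-initializing an existing one, because it also resets optimizer state such as momentum buffers.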

Sorry, I could not get this to work. What is m, and how can I import xavier?

m is each nn.Module of your model (model.apply calls the function on every submodule and on the model itself).
You can specify different initializers for each type of layer with the mentioned condition.

You will find all initializers in torch.nn.init.
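
As a rough sketch of what per-layer initialization could look like: the layer types, the choice of initializers, and the small example model below are my own assumptions for illustration, not something prescribed in this thread.

```python
import torch
import torch.nn as nn


def weights_init(m):
    # Pick a different initializer per layer type; these choices are illustrative.
    if isinstance(m, nn.Conv2d):
        nn.init.xavier_uniform_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)
    elif isinstance(m, nn.Linear):
        nn.init.kaiming_normal_(m.weight, nonlinearity="relu")
        if m.bias is not None:
            nn.init.zeros_(m.bias)


# Hypothetical model: 1x8x8 input, a 3x3 convolution without padding gives 4x6x6.
model = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(4 * 6 * 6, 10),
)

# apply() visits every submodule, so m is each layer in turn.
model.apply(weights_init)
```

Calling `model.apply(weights_init)` again before each fold puts the conv and linear layers back to a freshly initialized state.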



I have developed a function to perform cross-validation on medical images, but it is easily adaptable to other types of problems.



@ptrblck, thanks for the pointer to skorch. I am new to skorch and PyTorch. My understanding is that skorch is a PyTorch wrapper for sklearn. From searching online, sklearn itself does not seem to support CNNs. For example, cross_val_score needs an sklearn estimator, and the fit functions in sklearn's model_selection expect X to be array-like with shape = [n_samples, n_features] rather than images. Please help me understand how skorch can help with cross-validation for a CNN. Thanks.

Skorch tries to add exactly this missing scikit-learn compatibility to PyTorch.
The docs have some good examples.
E.g. using the NeuralNet class you can call methods like fit and predict using PyTorch models in the background.

@ptrblck Is there a specific way to pass our X, y data (already generated as training and validation loaders) directly to the grid search fit method? For instance, what is the preferred X, y input for the fit function in this scenario? It fails when I pass the generated training DataLoader (images loaded directly from a directory) to the grid search fit function. Links to examples, if there are any, would be quite helpful. Thanks.

A question about cross-validation.
First, we divide all the data into training samples and test samples, for example in proportions of 80% and 20%.
Then we divide the training samples into five groups, four of which are used as training data (64%) and one as validation data (16%). 5-fold cross-validation can then be carried out to find suitable hyperparameters for the CNN.
After the 5-fold cross-validation, what should we do with the test samples? Should we take the hyperparameters determined in the cross-validation stage, train the network again on the full 80%, and test on the 20%?
I am a little confused. Thank you.

Based on the .fit() docs, the fallback should be skorch.dataset.Dataset in case none of the other input types work:

If this doesn’t work with your data, you have to pass a Dataset that can deal with the data.

@rasbt explains CV beautifully in his blog post as well as his lecture notes. It’s always my source for a quick reassurance if I’m in doubt. :wink:
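
For the 80/20 question above, a commonly used workflow is: run the 5-fold search on the 80% only, retrain the best setting on the whole 80%, and evaluate once on the untouched 20%. A small sketch of that flow with scikit-learn, where `LogisticRegression` and the random data are merely stand-ins (my assumption) for the CNN and the real dataset:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Stand-in data: 100 samples, 5 features, binary labels.
rng = np.random.RandomState(0)
X = rng.randn(100, 5)
y = (X[:, 0] > 0).astype(int)

# 1) Hold out 20% as the final test set; it is not used during model selection.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# 2) 5-fold CV on the 80%: each fold trains on 64% and validates on 16% of all data.
search = GridSearchCV(LogisticRegression(), {"C": [0.1, 1.0, 10.0]}, cv=5, refit=True)
search.fit(X_train, y_train)

# 3) With refit=True, the best setting has been retrained on the full 80%.
# 4) Evaluate once on the held-out 20%.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, test_accuracy)
```

With a skorch-wrapped network in place of `LogisticRegression`, the same `GridSearchCV` call should apply.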


Thanks for the reply @ptrblck. It actually works with my dataset (a dataset from ImageFolder with transformations applied), but the forward(X) method of the neural net receives a list of X, y tensors instead of just the X tensor, which is weird. I am doing a grid search with skorch; here is the sample code:

import time

import torch.nn as nn
from sklearn.model_selection import GridSearchCV
from skorch import NeuralNet
from skorch.helper import predefined_split

# model, args, train_dataset, valid_dataset and the callbacks are defined elsewhere.
start = time.time()

net = NeuralNet(model, criterion=nn.CrossEntropyLoss,
                max_epochs=args.epochs,
                lr=args.learning_rate,
                batch_size=32,
                optimizer__momentum=0.09,
                iterator_train__shuffle=True,
                iterator_train__num_workers=4,
                iterator_valid__shuffle=True,
                iterator_valid__num_workers=4,
                train_split=predefined_split(valid_dataset),
                callbacks=[lr_scheduler, epoch_acc, checkpoint],
                device='cuda')

#pipe = Pipeline([('scale',StandardScaler()),
#                 ('net',net)])
params = {
    'module__num_units': [10, 20],
}
gs = GridSearchCV(net, params, refit=False, cv=5, scoring='accuracy')

gs.fit(train_dataset, y=None)
end = time.time()
print("Total training time: " + str(end - start))

Is there something specific I need to set so that only the corresponding X tensor is passed to my forward function? Thanks.

The skorch.dataset.Dataset implementation is not the same as torch.utils.data.Dataset as described here. Could you try to wrap your data in the skorch implementation and pass it to the fit function?

:+1: Thank you very much!

An example of such a transformation, for instance for the MNIST dataset, would be very useful.
Thank you in advance!

Which transformation do you mean?

It is still unclear to me how to use PyTorch's DataLoader with sklearn's cross-validation. All I can find on the internet is that it is impossible, and skorch's explanation is not clear to me.
Could you give some advice on wrapping a DataLoader in skorch this way?
Thank you very much in advance!