I want to use cifar 10 dataset from torchvision.datasets to do classification.
import torch import torchvision import torchvision.transforms as transforms from torch.utils.data.sampler import SubsetRandomSampler transform = transforms.Compose( [transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]) trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
the detail in trainset is :
type(trainset) # tuple type(trainset) # torch.Tensor (image (3,32,32)) type(trainset) # int (label 0~9)
and then I would like to split train data to training data and validation data like 5-fold
how can I split dataset by labels (not split randomly) ?
i.e. 50000 image data, 10 labels, each label has 5000 images
cross validation (5-fold): 40000 training data, each label has 4000 images
10000 validation data, each label has 1000 images
Thanks in advance.