I am still a PyTorch noob. I want to do Incremental Learning and want to split my training dataset (Cifar-10) into 10 equal parts (or 5, 12, 20, …), each part with the same target distribution.
I already tried it with sklearn's `train_test_split`, but that only splits the data into two parts:
```python
import numpy as np
import torch
import torchvision
import torchvision.transforms as transforms
from sklearn.model_selection import train_test_split

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
targets = trainset.targets

# Stratified 50/50 split over the dataset indices
data1_idx, data2_idx = train_test_split(
    np.arange(len(targets)), test_size=0.5, shuffle=True, stratify=targets)

data1_sampler = torch.utils.data.SubsetRandomSampler(data1_idx)
data2_sampler = torch.utils.data.SubsetRandomSampler(data2_idx)
data1_loader = torch.utils.data.DataLoader(trainset, batch_size=4, sampler=data1_sampler)
data2_loader = torch.utils.data.DataLoader(trainset, batch_size=4, sampler=data2_sampler)
```
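For more than two parts, I have been looking at sklearn's `StratifiedKFold` and using each fold's test indices as one part, since the folds are disjoint and class-balanced. This is only a rough sketch, with dummy labels standing in for `trainset.targets` so it runs without downloading CIFAR-10:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Dummy labels standing in for np.array(trainset.targets):
# 10 classes, 100 samples per class.
targets = np.repeat(np.arange(10), 100)

n_parts = 10  # could also be 5, 12, 20, ...
skf = StratifiedKFold(n_splits=n_parts, shuffle=True, random_state=0)

# Each fold's "test" indices form one disjoint, stratified part.
# skf.split only uses X for its length, so a dummy array works.
part_indices = [test_idx for _, test_idx in
                skf.split(np.zeros(len(targets)), targets)]

# Each part could then back its own sampler and loader, e.g.:
# sampler = torch.utils.data.SubsetRandomSampler(part_indices[0])
# loader = torch.utils.data.DataLoader(trainset, batch_size=4, sampler=sampler)

for idx in part_indices:
    print(len(idx), np.bincount(targets[idx], minlength=10))
```

With these dummy labels, each of the 10 parts contains 100 samples, 10 per class, so the target distribution is identical across parts. I am just not sure whether this is the idiomatic way to do it in PyTorch.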
How would you do it in PyTorch? Maybe you can point me to some example code.