I wish to use sklearn's train_test_split to create a validation set from the training set, but I am at a loss about the next steps.
    import torch
    import torchvision
    from sklearn.model_selection import train_test_split

    # Load dataset (transform['train'] is defined elsewhere in my script)
    train_set = torchvision.datasets.CIFAR10(
        root='./data', train=True, download=True,
        transform=transform['train'])

    # Create dataloader
    train_loader = torch.utils.data.DataLoader(
        train_set, batch_size=64, shuffle=True)

    # Take one batch from the dataloader
    images, labels = next(iter(train_loader))

    # Convert the tensors to numpy
    images_np = images.to('cpu').numpy()
    labels_np = labels.to('cpu').numpy()

    # Split validation data from the train set
    X_train, X_test, y_train, y_test = train_test_split(
        images_np, labels_np, test_size=0.2, random_state=42,
        shuffle=True, stratify=labels_np)
After splitting the dataset, how do I combine the resulting arrays back into something I can feed to a DataLoader?
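Is something like the following the right direction? This is an untested sketch using `torch.utils.data.TensorDataset`; the random arrays here are stand-ins for the outputs of `train_test_split` above.

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader

# Hypothetical arrays standing in for the train_test_split outputs
X_train = np.random.rand(80, 3, 32, 32).astype(np.float32)
y_train = np.random.randint(0, 10, size=80)

# Wrap the numpy arrays back into tensors, then into a Dataset
train_ds = TensorDataset(torch.from_numpy(X_train),
                         torch.from_numpy(y_train))

# A DataLoader accepts any Dataset, including this one
train_loader = DataLoader(train_ds, batch_size=64, shuffle=True)

# First batch: 64 images of shape (3, 32, 32) and 64 labels
images, labels = next(iter(train_loader))
```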
I also notice that I am only converting a single batch of images, taken with next(iter(train_loader)). How can I convert everything in one go?