Hello,
I want to use sklearn’s train_test_split to create a validation set from the CIFAR-10 training set, but I am at a loss about the next steps. Here is what I have so far:
import torch
import torchvision
from sklearn.model_selection import train_test_split

# Load the CIFAR-10 training set ('transform' is my dict of transforms, defined earlier)
train_set = torchvision.datasets.CIFAR10(
    root='./data', train=True, download=True, transform=transform['train'])
# Create dataloader
train_loader = torch.utils.data.DataLoader(
    train_set, batch_size=64, shuffle=True)
# Grab a single batch from the dataloader
images, labels = next(iter(train_loader))
# Convert the batch of images and labels to numpy arrays
images_np = images.cpu().numpy()
labels_np = labels.cpu().numpy()
# Split validation data from train set
X_train, X_test, y_train, y_test = train_test_split(
    images_np, labels_np, test_size=0.2, random_state=42, shuffle=True, stratify=labels_np)
After splitting, how do I combine the numpy arrays back into datasets that I can feed to a DataLoader?
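Is wrapping the split arrays in a TensorDataset the right direction? This is my rough attempt (just a sketch, I have not tested it):

# My guess: rebuild datasets from the split arrays with TensorDataset
import torch
from torch.utils.data import TensorDataset, DataLoader

train_ds = TensorDataset(torch.from_numpy(X_train), torch.from_numpy(y_train))
val_ds = TensorDataset(torch.from_numpy(X_test), torch.from_numpy(y_test))  # the 20% split, used for validation

# One dataloader per split
train_loader = DataLoader(train_ds, batch_size=64, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=64, shuffle=False)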
I also notice that next(iter(train_loader)) only gives me one batch, so I am only converting a single batch of images at a time. How can I convert everything in one go?
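Or should I skip the numpy conversion entirely and split on indices instead, using torch.utils.data.Subset? Something along these lines (again untested, just my guess):

# Alternative idea: split the indices, not the data, so the transforms stay lazy
import numpy as np
from sklearn.model_selection import train_test_split
from torch.utils.data import Subset, DataLoader

indices = np.arange(len(train_set))
targets = np.array(train_set.targets)  # CIFAR10 keeps its labels in .targets

train_idx, val_idx = train_test_split(
    indices, test_size=0.2, random_state=42, shuffle=True, stratify=targets)

train_loader = DataLoader(Subset(train_set, train_idx), batch_size=64, shuffle=True)
val_loader = DataLoader(Subset(train_set, val_idx), batch_size=64, shuffle=False)

Would one of these be the recommended approach?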