Creating a Dataset from scikit-learn's train_test_split

I have a dataset of images, each paired with a continuous value, and I’m using a CNN to predict that value. There are 14,000 images and 14,000 values. I know that with Keras I can use train_test_split (which comes from scikit-learn) to get X_train, y_train, X_test, and y_test,

but to train my model in PyTorch, do I combine X_train and y_train into a dataset and use a DataLoader, or is there another way, i.e. one that doesn’t use train_test_split?

My code so far:
The images are 360x360 and grayscaled so the input shape is (1,360,360)

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=42)

X_train.shape is  torch.Size([9814, 360, 360, 1])
y_train.shape is torch.Size([9814, 1])
X_test.shape is torch.Size([4834, 360, 360, 1])
y_test.shape is torch.Size([4834, 1])

training_set = torch.hstack((X_train,y_train))
validation_set = torch.hstack((X_test, y_test))

training_loader = torch.utils.data.DataLoader(training_set, batch_size=64, shuffle=True, num_workers=2)

validation_loader = torch.utils.data.DataLoader(validation_set, batch_size=64, shuffle=False, num_workers=2)

Thank you in advance!

Typically, users create a single Dataset object holding the images and targets, then use torch.utils.data.random_split to split it into train/test sets and wrap each split in a DataLoader.
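
As a rough illustration of that approach, here is a minimal sketch. It assumes X and Y are already tensors shaped (N, H, W, 1) and (N, 1) as in the question; the helper name make_loaders and the small random stand-in data are made up for the example, and the 0.33 fraction and seed 42 simply mirror the train_test_split call above. Since Conv2d expects channels-first input, the sketch also permutes the images to (N, 1, H, W).

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split


def make_loaders(X, Y, test_fraction=0.33, batch_size=64, seed=42):
    """Hypothetical helper: build train/validation loaders from two tensors.

    X: images in channels-last layout, shape (N, H, W, 1)
    Y: continuous targets, shape (N, 1)
    """
    # Move the channel axis to the front: (N, H, W, 1) -> (N, 1, H, W)
    X = X.permute(0, 3, 1, 2).float()
    Y = Y.float()

    # TensorDataset pairs X[i] with Y[i], so no hstack is needed
    dataset = TensorDataset(X, Y)

    # Split sizes must sum to len(dataset)
    n_test = int(len(dataset) * test_fraction)
    n_train = len(dataset) - n_test

    # A seeded generator makes the split reproducible
    generator = torch.Generator().manual_seed(seed)
    train_set, val_set = random_split(dataset, [n_train, n_test], generator=generator)

    train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    val_loader = DataLoader(val_set, batch_size=batch_size, shuffle=False)
    return train_loader, val_loader


# Small random stand-in for the real 360x360 data (assumption for the demo)
X = torch.rand(100, 36, 36, 1)
Y = torch.rand(100, 1)

train_loader, val_loader = make_loaders(X, Y, batch_size=16)
images, targets = next(iter(train_loader))
print(images.shape, targets.shape)
```

If you have already split the tensors with train_test_split, you can skip random_split and wrap each pair directly, e.g. TensorDataset(X_train, y_train), before passing it to a DataLoader. For image sets too large to hold in memory, a custom Dataset subclass that loads files lazily is probably the better fit.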

Here’s an additional tutorial that you may find helpful.