I created a class that inherits from torch.utils.data.Dataset and split an instance of it with torch.utils.data.random_split(dataset, [TRAIN_SIZE, TEST_SIZE]).
Afterwards I wrap the two splits in torch.utils.data.DataLoader instances and finally save them to the hard drive with torch.save:
dataset = DataSetClass(df=DATA, transform=transform, device=device, window=3)
DATASET_LENGTH = len(dataset)
TRAIN_SIZE = int(DATASET_LENGTH * 0.75)
TEST_SIZE = DATASET_LENGTH - TRAIN_SIZE
train_dataset, test_dataset = random_split(dataset, [TRAIN_SIZE, TEST_SIZE])
dataloader_train = DataLoader(dataset=train_dataset, batch_size=1, shuffle=False, num_workers=4)
dataloader_test = DataLoader(dataset=test_dataset, batch_size=1, shuffle=False, num_workers=4)
torch.save(dataloader_train, "/path/to/train.pt")
torch.save(dataloader_test, "/path/to/test.pt")
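For reference, here is a minimal standalone script that reproduces the observation. It uses a hypothetical TensorDataset of random numbers in place of my real DataSetClass and saves the split subsets directly (the DataLoader wrapping does not change the outcome), so the only assumption is the 75/25 split from above:

```python
import os
import tempfile

import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in for my DataSetClass: 1000 samples of 10 floats each.
data = torch.randn(1000, 10)
full_dataset = TensorDataset(data)

# Same 75/25 split as in my code.
train_dataset, test_dataset = random_split(full_dataset, [750, 250])

with tempfile.TemporaryDirectory() as tmp:
    train_path = os.path.join(tmp, "train.pt")
    test_path = os.path.join(tmp, "test.pt")
    torch.save(train_dataset, train_path)
    torch.save(test_dataset, test_path)
    sizes = (os.path.getsize(train_path), os.path.getsize(test_path))
    print(sizes)  # both files come out roughly the same size
```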
However, the two saved files are the same size on disk. Why is that?