Custom dataloader to load 2 tensordatasets simultaneously

Hi,

I would like to convert:

dataloader_data1 = torch.load('data1.pt')
dataloader_1 = torch.utils.data.DataLoader(dataloader_data1, batch_size=BATCHSIZE_G,
                                           shuffle=True, num_workers=workers)

dataloader_data2 = torch.load('data2.pt')
dataloader_2 = torch.utils.data.DataLoader(dataloader_data2, batch_size=BATCHSIZE,
                                           shuffle=True, num_workers=workers)

into something that lets me load both TensorDatasets through a single dataloader:

class LoadDataset(Dataset):

    def __init__(self):
        self.data_1 = torch.load('data1.pt')
        self.data_2 = torch.load('data2.pt')

    def __getitem__(self, index):
        return self.data_1, self.data_2

    def __len__(self):
        return min(len(self.data_1), len(self.data_2))

temp_data = LoadDataset()
dataloader = torch.utils.data.DataLoader(temp_data, batch_size=BATCHSIZE,
                                         shuffle=True, num_workers=workers)

This goes some way, as it creates the tuple. However, when I try to unpack it using

for i, (data,quant) in enumerate(dataloader):

I get:

TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'torch.utils.data.dataset.TensorDataset'>

I should add that data1 and data2 are of different lengths, and each contains 3 fields.

I would like to feed the model fields 1 & 2 from both data1 and data2 in the same loop.

Can anyone help, please?

chaslie

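SOLVED: I forgot to add [index] after self.data_1 and self.data_2 in __getitem__, so it was returning the whole TensorDatasets instead of one sample from each.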
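
For anyone who lands here later, this is a minimal sketch of the fix described above. It assumes both files hold TensorDatasets with 3 fields each. To keep the example self-contained, the torch.load calls are replaced with made-up random data, and the class name PairedDataset, the tensor shapes and the batch size are placeholders rather than anything from my real setup.

```python
import torch
from torch.utils.data import DataLoader, Dataset, TensorDataset


class PairedDataset(Dataset):
    """Returns one sample from each of two datasets for a given index."""

    def __init__(self, data_1, data_2):
        self.data_1 = data_1
        self.data_2 = data_2

    def __getitem__(self, index):
        # Index into each dataset so that a sample (a tuple of tensors) is
        # returned, not the TensorDataset object itself.
        return self.data_1[index], self.data_2[index]

    def __len__(self):
        # Use the shorter length so every index is valid for both datasets.
        return min(len(self.data_1), len(self.data_2))


# Stand-in data (hypothetical): 10 and 7 samples, 3 fields each.
data_1 = TensorDataset(torch.randn(10, 4), torch.randn(10, 2), torch.arange(10))
data_2 = TensorDataset(torch.randn(7, 4), torch.randn(7, 2), torch.arange(7))

paired = PairedDataset(data_1, data_2)
loader = DataLoader(paired, batch_size=3, shuffle=True)

for i, (data, quant) in enumerate(loader):
    # data and quant each hold 3 batched tensors, one per field.
    field1_a, field2_a = data[0], data[1]
    field1_b, field2_b = quant[0], quant[1]
```

Note that because __len__ returns the shorter length, samples in the longer dataset beyond that point are never drawn, and each batch pairs the two datasets at the same indices even with shuffle=True.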