Looping over DataLoader Returns List?

Hi,

I have a PyTorch tensor and use torch.utils.data.TensorDataset to turn it into a torch.utils.data.Dataset object. Then I pass it to a DataLoader and loop over it, yet each batch I get is a list, not a tensor, and I don’t understand why. Here is an MWE:

import torch 

data_geant_eval = torch.ones(256)
test_loader = torch.utils.data.DataLoader(
    dataset=torch.utils.data.TensorDataset(data_geant_eval),
    batch_size=256, 
)
for (idx, data) in enumerate(test_loader): 
    print(type(data))

Output:

<class 'list'>

Is that expected behavior?

Yes, this is expected since TensorDataset.__getitem__ returns a tuple as seen here.

Thank you for the answer! This is the __getitem__ function of TensorDataset:

def __getitem__(self, index):
        return tuple(tensor[index] for tensor in self.tensors)

It’s supposed to return a tuple, so I’m a bit confused about why my MWE prints list as the type.

The default_collate function of your DataLoader is creating the list:

dataset = torch.utils.data.TensorDataset(data_geant_eval)
print(type(dataset[0]))
# <class 'tuple'>

torch.utils.data._utils.collate.default_collate([dataset[0], dataset[1]])
# [tensor([1., 1.])]

So the dataset itself does return a tuple; the list is created in the collate step when the DataLoader assembles the batch.
You can find the code here.
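
If you just want the tensor inside the loop, one option is to unpack the one-element list in the loop header. Below is a minimal sketch that reuses the names from the MWE above; the helper variables sample, batch_type and batch_shape are only there for illustration and are not part of the original code.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

data_geant_eval = torch.ones(256)
dataset = TensorDataset(data_geant_eval)

# Each sample is a 1-tuple, because TensorDataset can wrap several tensors.
sample = dataset[0]
print(type(sample))  # <class 'tuple'>

test_loader = DataLoader(dataset, batch_size=256)

# default_collate turns the batch of tuples into a list with one batched
# tensor per wrapped tensor, so "(data,)" unpacks the single entry.
for idx, (data,) in enumerate(test_loader):
    batch_type = type(data)
    batch_shape = tuple(data.shape)
    print(idx, batch_type, batch_shape)  # 0 <class 'torch.Tensor'> (256,)
```

With more than one tensor in the TensorDataset (for example inputs and targets), the same pattern would be for idx, (x, y) in enumerate(test_loader).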