DataLoader not returning proper batch

Hi! I need help on DataLoader to resolve what is probably a simple misconception. Below, I have a MWE illustrating my use case. I input a list of three arrays of size 1024 to a custom Dataset, which will return element idx from each array as a sequence of three elements. I have checked that Dataset by iterating through it and confirmed that I could retrieve elements 0 through 1023. When using a DataLoader with batch_size=100, I expect the DataLoader to return batches of size 100. However, I only get one batch of size 1. I thought I understood how the DataLoader was supposed to work, but obviously not. Any insight is appreciated. Thanks.

import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

# Create 3 arrays of size 1024
a = np.random.randn(3, 1024)
print(a.shape) # 3,1024
a1, a2, a3 = a[0], a[1], a[2]
print(a1.shape) # 1024

class myDataset(Dataset):
    """A list of numpy arrays"""
    def __init__(self, data):
        assert isinstance(data, list), "myDataset: argument must be of type list"
        self.data = data
    def __getitem__(self, idx):
        return tuple(data[idx] for data in self.data)
    def __len__(self):
        return len(self.data)

data = myDataset([a1, a2, a3])
data_iter = DataLoader(data, batch_size=100, shuffle=False)

for index, values in enumerate(data_iter):
    print("index= ", index)
    print("values= ", values)

# output of the for loop: 
# index=  0
# values=  [tensor([-0.4421, -0.4562,  1.2012], dtype=torch.float64), tensor([-0.8228, -0.7304,  0.6380], dtype=torch.float64), tensor([ 1.2241,  0.4840, -0.0031], dtype=torch.float64)]

# I expected to collect about 10 batches of size 100. 
# However, I only collect the equivalent of a batch of size 1, and the for loop has only a single iteration. 

Thanks for the great MWE!
The issue is a bit tricky to find, since iterating the Dataset directly works while the DataLoader only returns a single batch. However, after checking the implementation you can see that len(data) returns 3, and the DataLoader uses that length to generate the sample indices. Only indices 0, 1, and 2 are drawn, so they all fit into one batch (note that each tensor in your output has three elements).
In your Dataset.__len__ function you are returning len(self.data), which is the length of the list of arrays (and thus 3).
Use return len(self.data[0]) (or the min over all array lengths in case these differ in your real use case) and it will work.
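
For reference, here is a minimal sketch of the Dataset with that change applied. The class name MyDataset and the variable names dataset, loader, and batch_sizes are only illustrative, and I am assuming the min-over-arrays variant of __len__ so it also covers arrays of different lengths:

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class MyDataset(Dataset):
    """Wraps a list of numpy arrays; sample idx is element idx of each array."""

    def __init__(self, data):
        assert isinstance(data, list), "MyDataset: argument must be of type list"
        self.data = data

    def __getitem__(self, idx):
        return tuple(d[idx] for d in self.data)

    def __len__(self):
        # Number of samples, not number of arrays.
        # min() guards against arrays of different lengths (an assumption).
        return min(len(d) for d in self.data)


a = np.random.randn(3, 1024)
dataset = MyDataset([a[0], a[1], a[2]])
loader = DataLoader(dataset, batch_size=100, shuffle=False)

# Each batch is a list of three tensors; record the size of the first one.
batch_sizes = [len(values[0]) for values in loader]
print(len(dataset), len(batch_sizes), batch_sizes[-1])
```

With 1024 samples and batch_size=100 you should see eleven batches rather than ten: ten full ones and a last one with the remaining 24 samples. Pass drop_last=True to the DataLoader if you want to discard the partial batch.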

Thanks for the solution! This works. An alternative solution is to use TensorDataset.
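
In case it helps someone else, a rough sketch of the TensorDataset variant, assuming all arrays have the same length (TensorDataset requires matching first dimensions). The names tensors, tensor_dataset, tensor_loader, and first are just placeholders:

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader

a = np.random.randn(3, 1024)

# TensorDataset expects tensors, so convert each numpy row first.
tensors = [torch.from_numpy(row) for row in a]

# Sample idx is (tensors[0][idx], tensors[1][idx], tensors[2][idx]);
# the dataset length is taken from the first dimension of the tensors.
tensor_dataset = TensorDataset(*tensors)
tensor_loader = DataLoader(tensor_dataset, batch_size=100, shuffle=False)

first = next(iter(tensor_loader))
print(len(tensor_dataset), len(tensor_loader), [tuple(t.shape) for t in first])
```

This avoids writing __len__ and __getitem__ by hand, at the cost of needing equally sized tensors up front.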