Variable size of batches for training?

Hi, I have a training set that I want to divide into batches of variable sizes based on an index list (batch 1 would contain data with indices 1 to 100, batch 2 indices 101 to 129, batch 3 indices 130 to 135, …, for instance). I checked DataLoader, but it seems to me that it only supports fixed-size batches. I wonder what would be a good way to do that?

Thank you!

Why don’t you shuffle your data and drop the last samples?

Because I want to keep the order fixed, so that a specific batch contains exactly the data specified by the index list. For my example above, batch 1 should only contain data with indices 1 to 100, not 100 random data points. The same goes for batches 2, 3, …

Do you know these lengths beforehand?
If so, you could use these indices to slice your data, set batch_size=1, and view your data to fake your batch size:

import torch
from torch.utils.data import Dataset, DataLoader

class MyDataset(Dataset):
    def __init__(self):
        self.data = torch.randn(250, 1)
        # boundaries of the variable-sized batches
        self.batch_indices = [0, 100, 129, 150, 200, 250]

    def __getitem__(self, index):
        # return one whole "batch" as a single sample
        start_idx = self.batch_indices[index]
        end_idx = self.batch_indices[index + 1]
        return self.data[start_idx:end_idx]

    def __len__(self):
        # one sample per batch slice
        return len(self.batch_indices) - 1

dataset = MyDataset()
loader = DataLoader(dataset, batch_size=1)

for data in loader:
    data = data.view(-1, 1)  # remove the fake batch dimension of size 1
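
An alternative worth mentioning (not from this thread, but standard PyTorch): DataLoader also accepts a batch_sampler argument, which can be any iterable yielding lists of indices, so you can pass your exact index lists directly and keep an ordinary per-sample Dataset. A minimal sketch, assuming the same 250-sample data and boundaries as above (PlainDataset is a hypothetical name for illustration):

```python
import torch
from torch.utils.data import Dataset, DataLoader

class PlainDataset(Dataset):
    """Ordinary per-sample dataset; batching is handled by the sampler."""
    def __init__(self):
        self.data = torch.randn(250, 1)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return len(self.data)

# Explicit index lists per batch: [0..99], [100..128], [129..149], ...
boundaries = [0, 100, 129, 150, 200, 250]
batches = [list(range(start, end))
           for start, end in zip(boundaries[:-1], boundaries[1:])]

# batch_sampler yields one index list per batch, so batch sizes can vary
loader = DataLoader(PlainDataset(), batch_sampler=batches)

shapes = [tuple(batch.shape) for batch in loader]
print(shapes)
```

This avoids the extra view call, since the default collate function already stacks each index list into a (batch_size, 1) tensor.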