Hi

I was trying to use `DataLoader` to enumerate my training samples, but I don't understand why it is slower than "manual batching".

**"Manual batching":**

```
samples_tensor = torch.tensor(samples, dtype=torch.float).cuda()
labels_tensor = torch.tensor(labels, dtype=torch.long).cuda()
for e in range(nbEpochs):
    for b in range(nbSamples // batch_size):
        # one contiguous slice per batch, no per-sample indexing
        x = samples_tensor[b * batch_size:(b + 1) * batch_size]
        y = labels_tensor[b * batch_size:(b + 1) * batch_size]
```

**"With dataloader":**

```
import torch.utils.data as utils
from torch.utils.data import DataLoader

samples_tensor = torch.tensor(samples, dtype=torch.float).cuda()
labels_tensor = torch.tensor(labels, dtype=torch.long).cuda()
dset = utils.TensorDataset(samples_tensor, labels_tensor)
data_train_loader = DataLoader(dset, batch_size=1000, shuffle=True)
for e in range(nbEpochs):
    for x, y in data_train_loader:
        pass
```

The `DataLoader` variant is MUCH slower than the manual slicing. Am I missing something?
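For what it's worth, a minimal CPU sketch (sizes are made up for illustration) suggests the per-sample indexing might be where the time goes: with automatic batching, `DataLoader` fetches every sample through an individual `__getitem__` call and then collates them, instead of taking one big slice like the manual loop does.

```
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical sizes, just to count the per-item fetches.
samples_tensor = torch.randn(1000, 8)
labels_tensor = torch.zeros(1000, dtype=torch.long)

class CountingDataset(TensorDataset):
    """TensorDataset that counts how often __getitem__ is called."""
    def __init__(self, *tensors):
        super().__init__(*tensors)
        self.getitem_calls = 0

    def __getitem__(self, index):
        self.getitem_calls += 1
        return super().__getitem__(index)

dset = CountingDataset(samples_tensor, labels_tensor)
loader = DataLoader(dset, batch_size=100, shuffle=False)

for x, y in loader:
    pass

# One call per sample (1000), not one per batch (10),
# plus a collate step to stack the samples back into a batch.
print(dset.getitem_calls)
```

If that is indeed the cause, I guess the overhead per epoch grows with the number of samples rather than the number of batches, which would match the slowdown I'm seeing.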

Thanks