Dataloader Iterator Issue

Hi. I am starting out with PyTorch and trying to write a simple classification script. I am using the CIFAR10 dataset to train the model. This part of the code downloads the dataset and stores it.

elif self.dataset_train == 'cifar10':
    self.dataset_train = datasets.CIFAR10(root=self.args.dataroot, train=True, download=True,
        transform=transforms.Compose([
            transforms.Scale(self.resolution),
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
        ])
    )

And this creates the dataloader:

dataloader_train = torch.utils.data.DataLoader(self.dataset_train, batch_size=self.args.batch_size,
    shuffle=True, num_workers=int(self.args.nthreads))

However, when I iterate through the dataset with a plain for loop (in a different function, where the loader is passed in as a parameter named dataloader), like this:

for i, data in enumerate(dataloader, 0):
    input, label = data
    ...

I am getting the following error:

> /home/rahul/CANVAS/pytorchnet/train.py(84)train()
-> for i, data in enumerate(dataloader, 0):
(Pdb) n
--Return--
> /home/rahul/CANVAS/pytorchnet/train.py(84)train()->None
-> for i, data in enumerate(dataloader, 0):
(Pdb) n
TypeError: TypeErro…nt'\n',)
> /home/rahul/CANVAS/pytorchnet/main.py(62)<module>()
-> loss_train = trainer.train(epoch, loader_train)
(Pdb) n
--Return--
> /home/rahul/CANVAS/pytorchnet/main.py(62)<module>()->None
-> loss_train = trainer.train(epoch, loader_train)
(Pdb) n
Traceback (most recent call last):
  File "main.py", line 62, in <module>
    loss_train = trainer.train(epoch, loader_train)
  File "/home/rahul/CANVAS/pytorchnet/train.py", line 84, in train
    for i, data in enumerate(dataloader, 0):
  File "/home/rahul/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 212, in __next__
    return self._process_next_batch(batch)
  File "/home/rahul/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 239, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
TypeError: Traceback (most recent call last):
  File "/home/rahul/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 41, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/rahul/anaconda2/lib/python2.7/site-packages/torchvision-0.1.8-py2.7.egg/torchvision/datasets/cifar.py", line 99, in __getitem__
    img = self.transform(img)
  File "/home/rahul/anaconda2/lib/python2.7/site-packages/torchvision-0.1.8-py2.7.egg/torchvision/transforms.py", line 29, in __call__
    img = t(img)
  File "/home/rahul/anaconda2/lib/python2.7/site-packages/torchvision-0.1.8-py2.7.egg/torchvision/transforms.py", line 139, in __call__
    ow = int(self.size * w / h)
TypeError: unsupported operand type(s) for /: 'tuple' and 'int'

When I try to debug with pdb, I see that inside PyTorch's dataloader.py the batch becomes an instance of ExceptionWrapper. It is not clear what exactly I am doing wrong, since I have run the same kind of loop over a CIFAR10 dataloader before. Please help!

The problem is with your Scale transform. I'm guessing self.resolution is a tuple? The torchvision documentation tracks the newest version, in which Scale does take a tuple, but the most recent release (which you presumably have — 0.1.8, per your traceback) only accepts a single int: the desired size of the smaller edge. You should either switch to passing an int (if possible, since that forces the scaled image to keep its original aspect ratio), or install torchvision from HEAD.
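
As a minimal sketch of the first option — assuming self.resolution is a tuple such as (32, 32), which is my guess, not something shown in your post — you could pass one side of it to Scale (here resolution stands in for self.resolution):

import torchvision.transforms as transforms

resolution = (32, 32)  # stand-in for self.resolution, assumed to be a tuple

# In the torchvision 0.1.8 release, Scale takes a single int: the target
# size of the smaller edge. CIFAR10 images are square, so scaling the
# smaller edge to resolution[0] yields exactly resolution[0] x resolution[0].
transform = transforms.Compose([
    transforms.Scale(resolution[0]),  # int, not tuple
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

Since CIFAR10 images are already square, the int form loses nothing here; the aspect-ratio restriction only matters for non-square inputs.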
