Hi @Bohan_Zhuang!
I’ve run into a similar error, and I’ve noticed that it goes away when I disable shuffling. With shuffle=True my code fails with a similar error message, but setting the parameter to False returns a correct-looking batch of tensors.
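Since shuffling only changes which samples end up in the same batch, my guess is that a few samples have a different shape from the rest, and shuffling just makes them collide in one batch. Here’s a minimal sketch I’d use to check that, assuming the DroneRGBEarlier dataset from the traceback below:

# Minimal shape audit over the whole dataset (DroneRGBEarlier is the
# dataset class from my own code below; the rest is standard library).
from collections import Counter

dataset = DroneRGBEarlier()
shapes = Counter(
    (tuple(dataset[i]['x'].size()), tuple(dataset[i]['y'].size()))
    for i in range(len(dataset))
)
print(shapes)  # more than one distinct entry means inconsistent sample sizes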
Edit:
Here’s the error output:
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-14-985e62e21c28> in <module>()
      1 dataloader = DataLoader(DroneRGBEarlier(), batch_size=4, shuffle=True)
      2 
----> 3 for i, batch in enumerate(dataloader):
      4     print(i, batch['x'].size(), batch['y'].size())
      5     break

C:\Anaconda3\envs\ml\lib\site-packages\torch\utils\data\dataloader.py in __next__(self)
    186         if self.num_workers == 0:  # same-process loading
    187             indices = next(self.sample_iter)  # may raise StopIteration
--> 188             batch = self.collate_fn([self.dataset[i] for i in indices])
    189             if self.pin_memory:
    190                 batch = pin_memory_batch(batch)

C:\Anaconda3\envs\ml\lib\site-packages\torch\utils\data\dataloader.py in default_collate(batch)
    114         return batch
    115     elif isinstance(batch[0], collections.Mapping):
--> 116         return {key: default_collate([d[key] for d in batch]) for key in batch[0]}
    117     elif isinstance(batch[0], collections.Sequence):
    118         transposed = zip(*batch)

C:\Anaconda3\envs\ml\lib\site-packages\torch\utils\data\dataloader.py in <dictcomp>(.0)
    114         return batch
    115     elif isinstance(batch[0], collections.Mapping):
--> 116         return {key: default_collate([d[key] for d in batch]) for key in batch[0]}
    117     elif isinstance(batch[0], collections.Sequence):
    118         transposed = zip(*batch)

C:\Anaconda3\envs\ml\lib\site-packages\torch\utils\data\dataloader.py in default_collate(batch)
     94         storage = batch[0].storage()._new_shared(numel)
     95         out = batch[0].new(storage)
---> 96         return torch.stack(batch, 0, out=out)
     97     elif elem_type.__module__ == 'numpy' and elem_type.__name__ != 'str_' \
     98             and elem_type.__name__ != 'string_':

C:\Anaconda3\envs\ml\lib\site-packages\torch\functional.py in stack(sequence, dim, out)
     62     inputs = [t.unsqueeze(dim) for t in sequence]
     63     if out is None:
---> 64         return torch.cat(inputs, dim)
     65     else:
     66         return torch.cat(inputs, dim, out=out)

RuntimeError: inconsistent tensor sizes at d:\pytorch\pytorch\torch\lib\th\generic/THTensorMath.c:2864
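For what it’s worth, the failing frame is default_collate handing the per-sample tensors to torch.stack, and torch.stack requires every tensor in the batch to have exactly the same shape. A tiny reproduction of the same RuntimeError:

# torch.stack only accepts tensors of identical shape, so a single
# odd-sized sample in a batch is enough to trigger this error.
import torch

a = torch.zeros(3, 32, 32)
b = torch.zeros(3, 32, 31)  # one dimension off
torch.stack([a, b], 0)      # RuntimeError: inconsistent tensor sizes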
Edit 2:
After going through the source @smth posted, I felt it might help to shed some light on the dataset as well. My dataset is effectively a collection of inputs and targets. A single sample is a dict with the following structure:
sample = {
    'x': torch.DoubleTensor of size [3 x 32 x 32],
    'y': torch.FloatTensor of size [32 x 32]
}
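Since those sizes are supposed to be fixed, the torch.cat failure suggests at least one sample actually deviates from them. If the sizes really do vary, one workaround (not from this thread, just a sketch; pad_crop_collate is a hypothetical name) is a custom collate_fn that forces everything to 32 x 32 before stacking:

# Zero-pad anything smaller than h x w, crop anything larger, so that
# torch.stack always sees identically sized tensors.
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader

def pad_crop_collate(batch, h=32, w=32):
    def fit(t):
        pad_h = max(0, h - t.size(-2))
        pad_w = max(0, w - t.size(-1))
        t = F.pad(t, (0, pad_w, 0, pad_h))  # pads the last two dims
        return t[..., :h, :w]
    return {'x': torch.stack([fit(d['x']) for d in batch], 0),
            'y': torch.stack([fit(d['y']) for d in batch], 0)}

dataloader = DataLoader(DroneRGBEarlier(), batch_size=4, shuffle=True,
                        collate_fn=pad_crop_collate)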
Edit 3:
Oddly enough, I noticed that with shuffle=True, keeping batch_size at 2 or below allows the code to run without problems. With any larger batch_size the code fails.
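My hedged guess for why that is: since the loop breaks after the first batch, a smaller batch_size just lowers the odds that two differently sized samples land in the one batch that actually gets tested. Draining the whole loader should make even batch_size=2 fail mid-epoch if the shape-mismatch theory holds:

# Sketch: iterate the full epoch instead of breaking after the first batch.
# If the shape-mismatch theory is right, this should also raise the
# "inconsistent tensor sizes" RuntimeError at some point.
from torch.utils.data import DataLoader

dataloader = DataLoader(DroneRGBEarlier(), batch_size=2, shuffle=True)
for i, batch in enumerate(dataloader):
    pass
print('completed', i + 1, 'batches without error')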