drop_last doesn't drop the last batch

This is my dataset’s __getitem__ function:

def __getitem__(self, index):
    input_path = self.input_dir + self.list_of_files[index] + '.jpg'
    target_path = self.target_dir + self.list_of_files[index] + '.png'

    # cv2.imread returns an HWC uint8 numpy array
    input = torch.from_numpy(cv2.imread(input_path))
    target = torch.IntTensor(cv2.imread(target_path,
                                        flags=cv2.IMREAD_UNCHANGED))

    # add a channel dimension to single-channel masks
    if len(target.shape) == 2:
        target = target[None, :, :]

    for transform in self.transforms:
        input = transform(input)
        target = transform(target)

    # shift labels to start at 0, then one-hot encode and move to (C, H, W)
    target -= 1
    target = target.type(torch.int64)
    target = N.functional.one_hot(target, num_classes)  # N: alias for torch.nn
    target = target.squeeze().type(torch.float32).permute(2, 0, 1)

    return (input, target)

Here’s the code for my dataset objects and dataloaders:

training_data = OxfordPetDataset('trainval', transforms=transforms)
test_data = OxfordPetDataset('test', transforms=transforms)

train_dataloader = DataLoader(training_data, batch_size=batch_size,
                              shuffle=True, drop_last=True)
test_dataloader = DataLoader(test_data, batch_size=batch_size,
                             shuffle=True, drop_last=True)

transforms simply resizes the input to a fixed size. I’m not entirely sure what’s wrong here.
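
For concreteness, the transform list is just something like this (a sketch; the exact size comes from my config):

from torchvision.transforms import Resize

# The entire pipeline: resize every sample to a fixed spatial size.
transforms = [Resize((256, 256))]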

Edit: Here’s the stack trace…

Cell In[22], line 5
      2 train_loss = 0
      3 print(f"\nStarting epoch {_ + 1}\n~~~~~~~~~~~~~~")
----> 5 for index, (input, target) in enumerate(train_dataloader):
      6     input, target = input.cuda(), target.cuda()
      7     model.train()

File ~/miniconda3/envs/pytorch-basics/lib/python3.12/site-packages/torch/utils/data/dataloader.py:630, in _BaseDataLoaderIter.__next__(self)
    627 if self._sampler_iter is None:
    628     # TODO(https://github.com/pytorch/pytorch/issues/76750)
    629     self._reset()  # type: ignore[call-arg]
--> 630 data = self._next_data()
    631 self._num_yielded += 1
    632 if self._dataset_kind == _DatasetKind.Iterable and \
    633         self._IterableDataset_len_called is not None and \
    634         self._num_yielded > self._IterableDataset_len_called:

File ~/miniconda3/envs/pytorch-basics/lib/python3.12/site-packages/torch/utils/data/dataloader.py:673, in _SingleProcessDataLoaderIter._next_data(self)
    671 def _next_data(self):
    672     index = self._next_index()  # may raise StopIteration
--> 673     data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
    674     if self._pin_memory:
    675         data = _utils.pin_memory.pin_memory(data, self._pin_memory_device)

File ~/miniconda3/envs/pytorch-basics/lib/python3.12/site-packages/torch/utils/data/_utils/fetch.py:55, in _MapDatasetFetcher.fetch(self, possibly_batched_index)
     53 else:
     54     data = self.dataset[possibly_batched_index]
---> 55 return self.collate_fn(data)

File ~/miniconda3/envs/pytorch-basics/lib/python3.12/site-packages/torch/utils/data/_utils/collate.py:317, in default_collate(batch)
    256 def default_collate(batch):
    257     r"""
    258     Take in a batch of data and put the elements within the batch into a tensor with an additional outer dimension - batch size.
    259 
   (...)
    315         >>> default_collate(batch)  # Handle `CustomType` automatically
    316     """
--> 317     return collate(batch, collate_fn_map=default_collate_fn_map)

File ~/miniconda3/envs/pytorch-basics/lib/python3.12/site-packages/torch/utils/data/_utils/collate.py:174, in collate(batch, collate_fn_map)
    171 transposed = list(zip(*batch))  # It may be accessed twice, so we use a list.
    173 if isinstance(elem, tuple):
--> 174     return [collate(samples, collate_fn_map=collate_fn_map) for samples in transposed]  # Backwards compatibility.
    175 else:
    176     try:

File ~/miniconda3/envs/pytorch-basics/lib/python3.12/site-packages/torch/utils/data/_utils/collate.py:142, in collate(batch, collate_fn_map)
    140 if collate_fn_map is not None:
    141     if elem_type in collate_fn_map:
--> 142         return collate_fn_map[elem_type](batch, collate_fn_map=collate_fn_map)
    144     for collate_type in collate_fn_map:
    145         if isinstance(elem, collate_type):

File ~/miniconda3/envs/pytorch-basics/lib/python3.12/site-packages/torch/utils/data/_utils/collate.py:214, in collate_tensor_fn(batch, collate_fn_map)
    212     storage = elem._typed_storage()._new_shared(numel, device=elem.device)
    213     out = elem.new(storage).resize_(len(batch), *list(elem.size()))
--> 214 return torch.stack(batch, 0, out=out)

RuntimeError: stack expects each tensor to be equal size, but got [225, 256, 256] at entry 0 and [500, 256, 256] at entry 1
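
If it helps narrow things down, the failure reduces to torch.stack being handed tensors of different sizes. A minimal reproduction using the two shapes from the error:

import torch

# default_collate stacks the per-sample tensors along a new batch
# dimension, which requires every sample to have exactly the same shape.
a = torch.zeros(225, 256, 256)
b = torch.zeros(500, 256, 256)
torch.stack([a, b], dim=0)  # RuntimeError: stack expects each tensor to be equal size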

Could you explain how the error you are seeing is related to the usage of drop_last=True? drop_last only discards the final incomplete batch when the dataset size isn’t divisible by the batch size; it has no effect on how the samples inside a batch are stacked.
The code fails because the per-sample shapes are mismatched, so did you check the original shape of the image and the transformed shape, making sure they can be stacked into a single batch?
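
A quick way to check is to pull a few raw samples and print their shapes before any batching happens (a sketch using the training_data object from your post):

# Every printed shape must match, otherwise default_collate cannot
# stack the samples into a single batch.
for i in range(3):
    inp, tgt = training_data[i]
    print(i, inp.shape, tgt.shape)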

Ok so the issue seems to have been with the fact that I used torch.from_numpy. Initially, I was using ToTensor(), which converts cv2’s HWC array into a CHW tensor, and that wasn’t causing any issues. But once I upgraded my torch and torchvision versions, ToTensor() was deprecated, and swapping in torch.from_numpy kept the HWC layout, so Resize (which treats the last two dimensions as height and width) was resizing the wrong axes. That’s why the first dimensions in the error don’t match. I changed it from

input = torch.from_numpy(cv2.imread(input_path))

to

input = Compose([ToImage(),
                 ToDtype(torch.float32, scale=True)])(cv2.imread(input_path))

and that seemed to work. Apologies!
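
In case anyone else runs into this, here’s a minimal sketch of the conversion step on its own (assuming a recent torchvision with the v2 API; the file name is just a placeholder):

import cv2
import torch
from torchvision.transforms import v2

# ToImage converts cv2's HWC uint8 numpy array into a CHW tensor, so
# spatial transforms such as Resize then act on the height/width axes.
to_tensor = v2.Compose([v2.ToImage(), v2.ToDtype(torch.float32, scale=True)])

img = cv2.imread('example.jpg')   # numpy array, shape (H, W, 3), uint8
tensor = to_tensor(img)           # tensor, shape (3, H, W), float32 in [0, 1]

It’s also slightly cheaper to build the Compose once in __init__ instead of rebuilding it on every __getitem__ call.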