Creating a dataloader for multi-page tiff images

JVedant · September 10, 2021, 1:37pm

Hi, I am trying to create a custom dataset that loads multi-page TIFF files (shape=(h, w, d, c)). The images have 4 channels, and applying transforms.ToTensor() to an image raises the following error:

ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/deadsec/miniconda3/envs/torchsm_86/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/deadsec/miniconda3/envs/torchsm_86/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/deadsec/miniconda3/envs/torchsm_86/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "<ipython-input-2-61874812cd51>", line 31, in __getitem__
    img = self.img_transform(img)
  File "/home/deadsec/miniconda3/envs/torchsm_86/lib/python3.8/site-packages/torchvision/transforms/transforms.py", line 60, in __call__
    img = t(img)
  File "/home/deadsec/miniconda3/envs/torchsm_86/lib/python3.8/site-packages/torchvision/transforms/transforms.py", line 97, in __call__
    return F.to_tensor(pic)
  File "/home/deadsec/miniconda3/envs/torchsm_86/lib/python3.8/site-packages/torchvision/transforms/functional.py", line 105, in to_tensor
    raise ValueError('pic should be 2/3 dimensional. Got {} dimensions.'.format(pic.ndim))
ValueError: pic should be 2/3 dimensional. Got 4 dimensions.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-9b4568424e81> in <module>
----> 1 for each in train_loader:
      2     print(each["image"].shape)
      3     print(each["mask"].shape)
      4     break

~/miniconda3/envs/torchsm_86/lib/python3.8/site-packages/torch/utils/data/dataloader.py in __next__(self)
    519             if self._sampler_iter is None:
    520                 self._reset()
--> 521             data = self._next_data()
    522             self._num_yielded += 1
    523             if self._dataset_kind == _DatasetKind.Iterable and \

~/miniconda3/envs/torchsm_86/lib/python3.8/site-packages/torch/utils/data/dataloader.py in _next_data(self)
   1201             else:
   1202                 del self._task_info[idx]
-> 1203                 return self._process_data(data)
   1204 
   1205     def _try_put_index(self):

~/miniconda3/envs/torchsm_86/lib/python3.8/site-packages/torch/utils/data/dataloader.py in _process_data(self, data)
   1227         self._try_put_index()
   1228         if isinstance(data, ExceptionWrapper):
-> 1229             data.reraise()
   1230         return data
   1231 

~/miniconda3/envs/torchsm_86/lib/python3.8/site-packages/torch/_utils.py in reraise(self)
    423             # have message field
    424             raise self.exc_type(message=msg)
--> 425         raise self.exc_type(msg)
    426 
    427

What can be done to solve this?

ptrblck · September 11, 2021, 5:55am

ToTensor() checks that image arrays have 2 or 3 dimensions, to make sure they are either grayscale or color images. Based on your description, you are working with a volume (with an additional depth dimension), so ToTensor() rejects the 4-dimensional array. Depending on your use case, you could convert the numpy array to a tensor manually via x = torch.from_numpy(img) and normalize it afterwards if needed.
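As a rough illustration of the manual conversion, here is a minimal sketch. It makes several assumptions that are not in the original post: the volume is already loaded as a numpy array of shape (h, w, d, c), the target layout is (c, d, h, w) (the layout nn.Conv3d expects), and uint8 data should be scaled to [0, 1] the way ToTensor() does for uint8 images. The names `volume_to_tensor` and `VolumeDataset` are hypothetical, and random in-memory arrays stand in for the actual TIFF files.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


def volume_to_tensor(vol):
    """Convert an (h, w, d, c) numpy volume to a float (c, d, h, w) tensor.

    Hypothetical helper: the channel-first layout is an assumption,
    chosen because it matches what nn.Conv3d expects.
    """
    x = torch.from_numpy(np.ascontiguousarray(vol))
    x = x.permute(3, 2, 0, 1).contiguous()  # (h, w, d, c) -> (c, d, h, w)
    if x.dtype == torch.uint8:
        # Mimic ToTensor(): scale uint8 values to the range [0, 1].
        return x.float().div(255.0)
    return x.float()


class VolumeDataset(Dataset):
    """Hypothetical dataset over volumes that are already in memory."""

    def __init__(self, volumes):
        self.volumes = volumes

    def __len__(self):
        return len(self.volumes)

    def __getitem__(self, idx):
        return {"image": volume_to_tensor(self.volumes[idx])}


# Random stand-in data with h=32, w=48, d=5, c=4 (not real TIFF content).
volumes = [
    np.random.randint(0, 256, size=(32, 48, 5, 4), dtype=np.uint8)
    for _ in range(3)
]
dataset = VolumeDataset(volumes)
sample = dataset[0]["image"]

# num_workers=0 keeps the example in a single process.
loader = DataLoader(dataset, batch_size=2, num_workers=0)
batch = next(iter(loader))
print(sample.shape, batch["image"].shape)
```

Further normalization (for example, subtracting a per-channel mean and dividing by a per-channel standard deviation) could be applied to the returned tensor inside `__getitem__`, if the use case calls for it.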