Making a custom image-to-image dataset using collate_fn and DataLoader

I have made a dataset using the PyTorch DataLoader and ImageFolder; my dataset class wraps two ImageFolder datasets. The two datasets are paired (original and ground-truth images), and I want to feed them to a PyTorch neural network. Dataset class:

class bsds_dataset(Dataset):
    def __init__(self, ds_main, ds_energy):
        self.dataset1 = ds_main
        self.dataset2 = ds_energy
    
    def __getitem__(self, index):
        x1 = self.dataset1[index]
        x2 = self.dataset2[index]
        
        return x1, x2
    
    def __len__(self):
        return len(self.dataset1)

and my collate function:

def my_collate(batch):
    data = [item[0] for item in batch]
    target = [item[1] for item in batch]
    target = torch.LongTensor(target)
    return [data, target]

I am trying to use images as both the original and the target data, like in an image segmentation task. I had asked this question before, and an answer referred me to this post for a detailed example. I used that snippet, but it is not working for me because of the size of the scalars, and I'm not sure what to do. Please help me.
Loading batches:

original_imagefolder = './images/whole'
target_imagefolder = './results/whole'

original_ds = ImageFolder(original_imagefolder, transform=transforms.ToTensor())
energy_ds = ImageFolder(target_imagefolder, transform=transforms.ToTensor())

dataset = bsds_dataset(original_ds, energy_ds)
loader = DataLoader(dataset, batch_size=16, collate_fn=my_collate)

for epoch in range(epochs):
    for i, x, y in enumerate(loader):
        print(x)

and the full traceback (including warnings):

C:\Anaconda3\envs\torchgpu\lib\site-packages\ipykernel_launcher.py:77: UserWarning: nn.init.xavier_normal is now deprecated in favor of nn.init.xavier_normal_.
C:\Anaconda3\envs\torchgpu\lib\site-packages\ipykernel_launcher.py:78: UserWarning: nn.init.constant is now deprecated in favor of nn.init.constant_.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-42-4646d595e649> in <module>
      5 optimizer = optim.SGD(model.parameters(), lr=0.001)
      6 for epoch in range(epochs):
----> 7     for i, x, y in enumerate(loader):
      8         print(x)

C:\Anaconda3\envs\torchgpu\lib\site-packages\torch\utils\data\dataloader.py in __next__(self)
    558         if self.num_workers == 0:  # same-process loading
    559             indices = next(self.sample_iter)  # may raise StopIteration
--> 560             batch = self.collate_fn([self.dataset[i] for i in indices])
    561             if self.pin_memory:
    562                 batch = _utils.pin_memory.pin_memory_batch(batch)

<ipython-input-38-0a73fb00a6d1> in my_collate(batch)
      2     data = [item[0] for item in batch]
      3     target = [item[1] for item in batch]
----> 4     target = torch.LongTensor(target)
      5     return [data, target]

ValueError: only one element tensors can be converted to Python scalars

If I understand your use case correctly, your targets are the segmentation masks for the data.
If that’s the case, you should handle them the same way and not wrap them in a torch.LongTensor, since they have variable sizes.
Try to return the data and target as a list of lists:

def my_collate(batch):
    data = [item[0] for item in batch]
    target = [item[1] for item in batch]
    return [data, target]
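
As a quick sanity check, each batch will then be a pair of Python lists, and since ImageFolder returns (image, class_index) tuples, the entries of data and target are themselves tuples, e.g.:

data, target = next(iter(loader))
print(len(data), len(target))   # 16 16
print(type(data[0]))            # <class 'tuple'> -> (image tensor, class index)
print(data[0][0].shape)         # e.g. torch.Size([3, 321, 481])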

I tried it, but this error occurred:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-147-d1dea1bc00f8> in <module>
     15     for i, batch in enumerate(loader):
     16         original, target = batch
---> 17         out = model(original)

C:\Anaconda3\envs\torchgpu\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    491             result = self._slow_forward(*input, **kwargs)
    492         else:
--> 493             result = self.forward(*input, **kwargs)
    494         for hook in self._forward_hooks.values():
    495             hook_result = hook(self, input, result)

<ipython-input-7-5f743c3455c4> in forward(self, x)
     89         # encoder pathway, save outputs for merging
     90         for i, module in enumerate(self.down_convs):
---> 91             x, before_pool = module(x)
     92             encoder_outs.append(before_pool)
     93 

C:\Anaconda3\envs\torchgpu\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    491             result = self._slow_forward(*input, **kwargs)
    492         else:
--> 493             result = self.forward(*input, **kwargs)
    494         for hook in self._forward_hooks.values():
    495             hook_result = hook(self, input, result)

<ipython-input-5-26a0f7e21ea6> in forward(self, x)
     14 
     15     def forward(self, x):
---> 16         x = F.relu(self.conv1(x))
     17         x = F.relu(self.conv2(x))
     18         before_pool = x

C:\Anaconda3\envs\torchgpu\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    491             result = self._slow_forward(*input, **kwargs)
    492         else:
--> 493             result = self.forward(*input, **kwargs)
    494         for hook in self._forward_hooks.values():
    495             hook_result = hook(self, input, result)

C:\Anaconda3\envs\torchgpu\lib\site-packages\torch\nn\modules\conv.py in forward(self, input)
    336                             _pair(0), self.dilation, self.groups)
    337         return F.conv2d(input, self.weight, self.bias, self.stride,
--> 338                         self.padding, self.dilation, self.groups)
    339 
    340 

TypeError: conv2d(): argument 'input' (position 1) must be Tensor, not list

Can you please help me? Thanks a lot.

How would you like to process the variable-shaped tensors?
Your custom collate_fn should work now and should return a list with data and target tensors.
However, as the new error message states, you cannot pass a list of tensors to your model, so you could e.g. loop over the data and target pairs and pass the samples one by one to the model.

@ptrblck Thank you for clarifying, but I still cannot train the model. I am trying to access each sample with a loop, but the model needs the batch dimension:

for epoch in range(epochs):
    for i, batch in enumerate(loader):
        for item in range(batch_size):
            original_list, target_list = batch
            original = original_list[item][0]
            target = target_list[item][0]
            
            out = model(original)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-95-2249137957c2> in <module>
     19             target = target_list[item][0]
     20 
---> 21             out = model(original)

C:\Anaconda3\envs\torchgpu\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    491             result = self._slow_forward(*input, **kwargs)
    492         else:
--> 493             result = self.forward(*input, **kwargs)
    494         for hook in self._forward_hooks.values():
    495             hook_result = hook(self, input, result)

<ipython-input-7-5f743c3455c4> in forward(self, x)
     89         # encoder pathway, save outputs for merging
     90         for i, module in enumerate(self.down_convs):
---> 91             x, before_pool = module(x)
     92             encoder_outs.append(before_pool)
     93 

C:\Anaconda3\envs\torchgpu\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    491             result = self._slow_forward(*input, **kwargs)
    492         else:
--> 493             result = self.forward(*input, **kwargs)
    494         for hook in self._forward_hooks.values():
    495             hook_result = hook(self, input, result)

<ipython-input-5-26a0f7e21ea6> in forward(self, x)
     14 
     15     def forward(self, x):
---> 16         x = F.relu(self.conv1(x))
     17         x = F.relu(self.conv2(x))
     18         before_pool = x

C:\Anaconda3\envs\torchgpu\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    491             result = self._slow_forward(*input, **kwargs)
    492         else:
--> 493             result = self.forward(*input, **kwargs)
    494         for hook in self._forward_hooks.values():
    495             hook_result = hook(self, input, result)

C:\Anaconda3\envs\torchgpu\lib\site-packages\torch\nn\modules\conv.py in forward(self, input)
    336                             _pair(0), self.dilation, self.groups)
    337         return F.conv2d(input, self.weight, self.bias, self.stride,
--> 338                         self.padding, self.dilation, self.groups)
    339 
    340 

RuntimeError: Expected 4-dimensional input for 4-dimensional weight 64 3 3, but got 3-dimensional input of size [3, 321, 481] instead

And if I only index the item (without taking its first element), an error occurs saying that the input should be a Tensor, not a tuple:

for epoch in range(epochs):
    for i, batch in enumerate(loader):
        for item in range(batch_size):
            original_list, target_list = batch
            original = original_list[item]
            out = model(original)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-103-1b36111ac68f> in <module>
     19             target = target_list[item][0]
     20 
---> 21             out = model(original)

C:\Anaconda3\envs\torchgpu\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    491             result = self._slow_forward(*input, **kwargs)
    492         else:
--> 493             result = self.forward(*input, **kwargs)
    494         for hook in self._forward_hooks.values():
    495             hook_result = hook(self, input, result)

<ipython-input-7-5f743c3455c4> in forward(self, x)
     89         # encoder pathway, save outputs for merging
     90         for i, module in enumerate(self.down_convs):
---> 91             x, before_pool = module(x)
     92             encoder_outs.append(before_pool)
     93 

C:\Anaconda3\envs\torchgpu\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    491             result = self._slow_forward(*input, **kwargs)
    492         else:
--> 493             result = self.forward(*input, **kwargs)
    494         for hook in self._forward_hooks.values():
    495             hook_result = hook(self, input, result)

<ipython-input-5-26a0f7e21ea6> in forward(self, x)
     14 
     15     def forward(self, x):
---> 16         x = F.relu(self.conv1(x))
     17         x = F.relu(self.conv2(x))
     18         before_pool = x

C:\Anaconda3\envs\torchgpu\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    491             result = self._slow_forward(*input, **kwargs)
    492         else:
--> 493             result = self.forward(*input, **kwargs)
    494         for hook in self._forward_hooks.values():
    495             hook_result = hook(self, input, result)

C:\Anaconda3\envs\torchgpu\lib\site-packages\torch\nn\modules\conv.py in forward(self, input)
    336                             _pair(0), self.dilation, self.groups)
    337         return F.conv2d(input, self.weight, self.bias, self.stride,
--> 338                         self.padding, self.dilation, self.groups)
    339 
    340 

TypeError: conv2d(): argument 'input' (position 1) must be Tensor, not tuple

How can I add the batch dimension? Can you please explain?
Thanks a lot

You could unsqueeze the batch dimension via original = original.unsqueeze(0) before passing it to the model.
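
A minimal sketch of that per-sample loop, assuming the collate_fn from above and that each ImageFolder sample is an (image, class_index) tuple, so [0] picks the image (model, criterion, and optimizer stand in for your own objects):

for epoch in range(epochs):
    for original_list, target_list in loader:
        # zip also handles a last batch that is smaller than batch_size
        for orig_item, target_item in zip(original_list, target_list):
            original = orig_item[0].unsqueeze(0)   # [3, H, W] -> [1, 3, H, W]
            target = target_item[0].unsqueeze(0)

            optimizer.zero_grad()
            out = model(original)
            loss = criterion(out, target)
            loss.backward()
            optimizer.step()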

Thanks for the idea. Just wondering, is it possible to pass the list of tensors to the model directly, as that would be much faster?

You could pass a list to the model and apply a loop internally to forward each sample, which would be slower than the batched approach. Also, it would most likely break data-parallel approaches.

The advantage of a batch of inputs is that the operations in your model can directly use the batched data without unwrapping it. If you use a Python list, the methods won’t accept these lists at the moment, and you would have to add a “fake batch dimension” to each sample.

There is ongoing work to add utilities such as vmap etc., which might relax this condition in the future.

Thanks for the reply! But what would be the difference between using batch_size = 1 and applying a loop internally to forward each sample? It kind of baffles me.

If you are using a batch size of 1, you wouldn’t need the loop, so there wouldn’t be a difference.

However, if your batch size is e.g. 64, it would make a difference to loop over each sample or pass the batch directly to the model.
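
For example, if you could resize (or crop) both the inputs and the targets to a common shape, the default collate_fn would stack them into single batch tensors and no per-sample loop would be needed. A rough sketch, assuming a fixed 321x481 size (note that resizing interpolates the target images, which may or may not be acceptable for your task):

transform = transforms.Compose([
    transforms.Resize((321, 481)),
    transforms.ToTensor(),
])

original_ds = ImageFolder(original_imagefolder, transform=transform)
energy_ds = ImageFolder(target_imagefolder, transform=transform)
dataset = bsds_dataset(original_ds, energy_ds)
loader = DataLoader(dataset, batch_size=64, shuffle=True)  # default collate_fn

for (original, _), (target, _) in loader:
    out = model(original)   # original has shape [batch_size, 3, 321, 481]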