Create batch of images from a list of images

I'm trying to create a batch of images from a list of images. However, sometimes the batch ends up longer than the batch_size I defined. For example, I set batch_size to 15, but when I call len() I sometimes get 17 images instead of 15. Any help would be appreciated!

Here is the code I'm using to create the batch of images:

    def _to_batch(self, imgs: List[torch.Tensor], device) -> torch.Tensor:
        batch_list = [x.unsqueeze(0) for x in imgs]
        if self.batch_size > len(batch_list):
            fill_size = self.batch_size - len(batch_list)
            batch_list.append(torch.zeros([fill_size, 3, self._img_size[0], self._img_size[1]]).to(device))
        batch = torch.cat(batch_list, 0).half()
        return batch

I know I also have to check whether my total list of images exceeds my batch_size and keep the extra images for the next batch, but I can't work out the logic for it :expressionless:
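For reference, the overflow can be reproduced in isolation: when imgs holds more tensors than batch_size, nothing truncates the list, so torch.cat returns every image. A minimal CPU sketch of the same logic, with the batch size and image size made up for illustration:

```python
import torch

def to_batch(imgs, batch_size=15, img_size=(224, 224)):
    # same logic as _to_batch above, without the class wrapper
    batch_list = [x.unsqueeze(0) for x in imgs]
    if batch_size > len(batch_list):
        fill_size = batch_size - len(batch_list)
        # pad with zeroed images when there are too few
        batch_list.append(torch.zeros([fill_size, 3, *img_size]))
    # nothing limits the list when there are too many
    return torch.cat(batch_list, 0)

short = to_batch([torch.rand(3, 224, 224) for _ in range(10)])
long = to_batch([torch.rand(3, 224, 224) for _ in range(17)])
print(short.shape[0])  # 15 -- padded up to batch_size
print(long.shape[0])   # 17 -- overflows batch_size
```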

It’s good that you found that issue.

To keep the images for the next batch, define self.batch_list = [] in your model. Inside _to_batch, extend that list with the new images: self.batch_list.extend([x.unsqueeze(0) for x in imgs]). The rest of the code stays largely the same: substitute batch_list with self.batch_list and, before concatenation, slice off the first batch_size elements into a local variable that you pass to torch.cat, while storing the remainder back in self.batch_list.

I hope this helps. If you need further help, I can write the code for you. :slightly_smiling_face:

@ParGG Thanks for the help. I tried, but I still can't keep the images for the next batch. Can you elaborate?

Here I subclass your class to give you an example, overriding your _to_batch implementation. I've added comments explaining the different steps. Let me know if it works for you or if you have any questions.

    class MyClass(YourClass):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            # define the container for the tensors to be batched
            self.batch_list = []

        def _to_batch(self, imgs: List[torch.Tensor], device) -> torch.Tensor:
            # extend the list with the newly loaded tensors
            self.batch_list.extend([x.unsqueeze(0) for x in imgs])

            # if the list is shorter than the batch size
            if (fill_size := self.batch_size - len(self.batch_list)) > 0:
                # fill the list to the desired length with zeroed tensors
                self.batch_list.append(torch.zeros([fill_size, 3, self._img_size[0], self._img_size[1]]).to(device))

            # here the list holds at least `batch_size` images' worth of tensors;
            # take the first `batch_size` elements and
            # put them in a variable that will be used for concatenation
            batch_list = self.batch_list[:self.batch_size]

            # keep the remaining tensors and propagate them to the next call
            self.batch_list = self.batch_list[self.batch_size:]

            # concatenate the selected tensors and return them as a batch
            return torch.cat(batch_list, 0).half()
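To see the carry-over in action, here is a standalone CPU version of the same logic (a sketch; the Batcher name, batch size, and image size are placeholders, and .half() is dropped so it runs anywhere):

```python
import torch

class Batcher:
    # standalone version of the carry-over logic above
    def __init__(self, batch_size=15, img_size=(224, 224)):
        self.batch_size = batch_size
        self._img_size = img_size
        self.batch_list = []

    def to_batch(self, imgs):
        # extend the list with the new images
        self.batch_list.extend(x.unsqueeze(0) for x in imgs)
        # pad with zeroed images if there are too few
        if (fill_size := self.batch_size - len(self.batch_list)) > 0:
            self.batch_list.append(
                torch.zeros([fill_size, 3, *self._img_size]))
        # slice off one batch, keep the remainder for the next call
        batch_list = self.batch_list[:self.batch_size]
        self.batch_list = self.batch_list[self.batch_size:]
        return torch.cat(batch_list, 0)

b = Batcher()
first = b.to_batch([torch.rand(3, 224, 224) for _ in range(17)])
print(first.shape[0])     # 15
print(len(b.batch_list))  # 2 leftover images carried over
second = b.to_batch([torch.rand(3, 224, 224) for _ in range(5)])
print(second.shape[0])    # 15 -- 2 leftovers + 5 new + 8 zero pads
```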

Hey @ParGG, thanks for completing the code and the explanations. Unfortunately, it doesn't work completely. I sometimes get these errors:

list index out of range
list index out of range
list index out of range
list index out of range
tile cannot extend outside image
tile cannot extend outside image
list index out of range
list index out of range

Those errors could be coming from other parts of your code. Can you check the traceback? To see whether this code works, put a breakpoint in the function and check that everything behaves as expected.