Split an image into four equal coordinates

Is there a function in PyTorch that allows me to split an image into four equal coordinates?

Thanks a lot.

Could you explain a bit what coordinates mean in the context of an image? :slight_smile:

I'm trying to write code that counts objects in an image. I would like to split the image into 4 equal areas, run a count on each part, and then sum them all. Many Thanks for your help.

To create 4 patches of your input, you could use unfold as shown here:

import torch

x = torch.randn(3, 24, 24) # channels, height, width
kernel_size, stride = 12, 12
patches = x.unfold(1, kernel_size, stride).unfold(2, kernel_size, stride)
patches = patches.contiguous().view(patches.size(0), -1, kernel_size, kernel_size)
print(patches.shape) # channels, patches, kernel_size, kernel_size
> torch.Size([3, 4, 12, 12])

Once you have these patches, you could pass them to your model separately or as a batch.
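For example, here is a minimal sketch of passing the 4 patches as a single batch; the model below is just a placeholder standing in for whatever counting model you use:

import torch
import torch.nn as nn

x = torch.randn(3, 24, 24)
kernel_size, stride = 12, 12
patches = x.unfold(1, kernel_size, stride).unfold(2, kernel_size, stride)
patches = patches.contiguous().view(patches.size(0), -1, kernel_size, kernel_size)

# move the patch dimension to the front so the 4 patches act as the batch dimension
batch = patches.permute(1, 0, 2, 3) # [4, 3, 12, 12]

# placeholder model; any model accepting [N, 3, 12, 12] inputs would work here
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * kernel_size * kernel_size, 1))
out = model(batch) # one prediction per patch
total = out.sum()  # e.g. sum the per-patch counts
print(out.shape)
> torch.Size([4, 1])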


I am quite new to PyTorch. Could you please explain how you selected the values for the channels, height, width, kernel_size, and stride? Many Thanks

I selected the channels randomly as 3, since it would correspond to an RGB image.
The kernel size and stride of 12 were chosen to create 4 patches of the 24x24 spatial input shape.
You might need to change these values if your input shape is different.
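For example, here is a minimal sketch assuming a single-channel 28x28 input (e.g. Fashion-MNIST): halving the spatial size gives a kernel size and stride of 14, which again yields 4 patches:

import torch

x = torch.randn(1, 28, 28) # channels, height, width
kernel_size, stride = 14, 14 # half the spatial size -> 2x2 = 4 patches
patches = x.unfold(1, kernel_size, stride).unfold(2, kernel_size, stride)
patches = patches.contiguous().view(patches.size(0), -1, kernel_size, kernel_size)
print(patches.shape)
> torch.Size([1, 4, 14, 14])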

@ptrblck Many Thanks for your help. I would like to implement this in a training pipeline. My aim is to increase the number of images available for training. The idea I have is to split each image into 4 patches prior to training the model, which counts the number of cells in a microscopy image. After splitting each image into 4 patches, I would like to run a count on each patch and then sum the totals. Is this feasible, or would you recommend a different method? Many Thanks

Hey ptrblck, I have used your code to create patches for Fashion-MNIST, and I am struggling to understand how to use them in my model. Do I create a new array of tensors or mini-batches, or do I separate the patches to create a longer array? And how would I do that?

Could you explain your use case a bit more and in particular how you would like to use these patches in the model?
The reshaping of these patches might depend on the actual model architecture, so I would need more information about your use case.

Hi ptrblck, thanks for your code. I am now wondering how I can fold the patches back into the original image x after some computations on them. Could this be implemented with fold?

Yes, fold can be used to recreate the input tensor assuming no overlapping patches were used before.
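As a minimal sketch (assuming the same 3x24x24 example with non-overlapping 12x12 patches), the patches can be rearranged into the column layout expected by F.fold to recover x:

import torch
import torch.nn.functional as F

x = torch.randn(3, 24, 24)
kernel_size, stride = 12, 12
patches = x.unfold(1, kernel_size, stride).unfold(2, kernel_size, stride)
patches = patches.contiguous().view(patches.size(0), -1, kernel_size, kernel_size) # [3, 4, 12, 12]

# ... computations on the patches would go here ...

# rearrange to [N, C * kernel_size * kernel_size, num_patches] as expected by fold
cols = patches.permute(0, 2, 3, 1).reshape(1, x.size(0) * kernel_size * kernel_size, -1)
x_rec = F.fold(cols, output_size=x.shape[1:], kernel_size=kernel_size, stride=stride)
print(torch.allclose(x_rec.squeeze(0), x))
> True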