Dividing the image into patches

Roqyah_bdeen · July 8, 2023, 8:19am

I have an image batch of size [10,3,256,832]. I want to create a new tensor from each image in the batch by dividing it into small windows that slide over the image as in a convolution, so consecutive windows overlap. I also want to control the window size and the stride. This picture shows what I mean.
Could anyone tell me how to do this, please?

ptrblck · July 8, 2023, 8:31am

tensor.unfold should work:

import torch

x = torch.arange(7*7).view(7, 7)
print(x)
# tensor([[ 0,  1,  2,  3,  4,  5,  6],
#         [ 7,  8,  9, 10, 11, 12, 13],
#         [14, 15, 16, 17, 18, 19, 20],
#         [21, 22, 23, 24, 25, 26, 27],
#         [28, 29, 30, 31, 32, 33, 34],
#         [35, 36, 37, 38, 39, 40, 41],
#         [42, 43, 44, 45, 46, 47, 48]])

out = x.unfold(0, 3, 1).unfold(1, 3, 1)
out = out.contiguous().view(-1, 3*3)
print(out)
# tensor([[ 0,  1,  2,  7,  8,  9, 14, 15, 16],
#         [ 1,  2,  3,  8,  9, 10, 15, 16, 17],
#         [ 2,  3,  4,  9, 10, 11, 16, 17, 18],
#         [ 3,  4,  5, 10, 11, 12, 17, 18, 19],
#         [ 4,  5,  6, 11, 12, 13, 18, 19, 20],
#         [ 7,  8,  9, 14, 15, 16, 21, 22, 23],
#         [ 8,  9, 10, 15, 16, 17, 22, 23, 24],
#         [ 9, 10, 11, 16, 17, 18, 23, 24, 25],
#         [10, 11, 12, 17, 18, 19, 24, 25, 26],
#         [11, 12, 13, 18, 19, 20, 25, 26, 27],
#         [14, 15, 16, 21, 22, 23, 28, 29, 30],
#         [15, 16, 17, 22, 23, 24, 29, 30, 31],
#         [16, 17, 18, 23, 24, 25, 30, 31, 32],
#         [17, 18, 19, 24, 25, 26, 31, 32, 33],
#         [18, 19, 20, 25, 26, 27, 32, 33, 34],
#         [21, 22, 23, 28, 29, 30, 35, 36, 37],
#         [22, 23, 24, 29, 30, 31, 36, 37, 38],
#         [23, 24, 25, 30, 31, 32, 37, 38, 39],
#         [24, 25, 26, 31, 32, 33, 38, 39, 40],
#         [25, 26, 27, 32, 33, 34, 39, 40, 41],
#         [28, 29, 30, 35, 36, 37, 42, 43, 44],
#         [29, 30, 31, 36, 37, 38, 43, 44, 45],
#         [30, 31, 32, 37, 38, 39, 44, 45, 46],
#         [31, 32, 33, 38, 39, 40, 45, 46, 47],
#         [32, 33, 34, 39, 40, 41, 46, 47, 48]])
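
As a rough sketch of how the arguments of unfold(dimension, size, step) interact: the number of windows along a dimension of length n should be (n - size) // step + 1. The window size of 5 and step of 2 below are illustrative values chosen for this sketch, not taken from the thread.

```python
import torch

x = torch.arange(7 * 7).view(7, 7)

# unfold(dimension, size, step): slide a window of length `size`
# along `dimension`, advancing `step` elements each time.
size, step = 5, 2  # illustrative values (assumption)
patches = x.unfold(0, size, step).unfold(1, size, step)

# Windows per dimension: (7 - 5) // 2 + 1 = 2
n_windows = (x.shape[0] - size) // step + 1
print(n_windows)
print(patches.shape)  # [n_windows, n_windows, size, size]

# patches[i, j] is the window whose top-left corner is (i * step, j * step)
first = patches[0, 0]
last = patches[1, 1]
```

Keeping the two leading window-index dimensions (instead of flattening with view(-1, ...)) makes it easy to map each patch back to its position in the image.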

Roqyah_bdeen · July 8, 2023, 8:40am

Thank you so much for your quick reply.
I have two more questions, if you don’t mind. First, in unfold(0,3,1), I assume 3 is the window size along one dimension, giving a 3x3 window, so to enlarge the window to 25 pixels it should be 5, right? And 1 is the stride, I think, so what is 0?
Second, how can I do this operation for the 3 channels together? The output should have 3 batches, each representing one channel. And what if I want to do the same for all 10 images together?
I hope my question is clear.
Thank you in advance.

ptrblck · July 8, 2023, 9:49am

The 0 is the dimension to unfold along, as described in the docs.

Specify the spatial dimensions only instead of the batch or channel dimension in the unfold operation.
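
A minimal sketch of what that could look like for the batch from the original question, assuming the usual [batch, channels, height, width] layout; win_size and stride are illustrative names and values, not something prescribed by the docs.

```python
import torch

# Batch of 10 three-channel images: [batch, channels, height, width]
images = torch.rand(10, 3, 256, 832)
win_size, stride = 5, 1  # illustrative values (assumption)

# Unfold only dim 2 (height) and dim 3 (width);
# the batch and channel dimensions are left untouched.
patches = images.unfold(2, win_size, stride).unfold(3, win_size, stride)

# Shape is [batch, channels, windows_h, windows_w, win_size, win_size],
# with windows_h = (256 - 5) // 1 + 1 and windows_w = (832 - 5) // 1 + 1.
print(patches.shape)

# patches[b, c, i, j] is the window starting at row i*stride, column j*stride
# of channel c in image b.
window = patches[1, 2, 3, 4]
```

Since unfold returns a view of the input, the overlapping windows are not copied until something like .contiguous() is called, which can matter for memory with a stride of 1.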

Roqyah_bdeen · July 8, 2023, 10:00am

For example, I want to apply this unfolding to an image with only 2 channels, but I got this error message

I don’t understand why.
How do I specify the spatial dimensions instead of the batch or channel dimension? Could you provide a line of code to explain it?

Roqyah_bdeen · July 8, 2023, 2:28pm

To explain the operation I want to do in more detail:
I have one image batch with 2 channels, size=[10,2,256,382], and a second one with 3 channels, size=[10,3,256,382]. I want to divide both into patches, which I did using the following code:

import torch

image2d=torch.rand([10,2,256,382])
image3d=torch.rand([10,3,256,382])
win_size=5
stride=1
image2d_patches=image2d.unfold(2,win_size,stride).unfold(3,win_size,stride)
image3d_patches=image3d.unfold(2,win_size,stride).unfold(3,win_size,stride)

After that, I need to multiply the center of each patch in image3d_patches by the surrounding pixels inside the same patch, and subtract the center of each patch in image2d_patches from each of its surrounding pixels. I did this step by first creating two tensors holding the center pixels of all patches in image2d_patches and image3d_patches:

# index both window dimensions (4 and 5) to get the single center pixel of each patch
center_im2d=image2d_patches[:,:,:,:,win_size//2,win_size//2]
center_im3d=image3d_patches[:,:,:,:,win_size//2,win_size//2]

Finally, I perform the multiplication and subtraction as follows:

mul_out=[center_im3d[i,j,k,:,None,None]*image3d_patches[i,j,k] for i in range(image3d_patches.shape[0]) for j in range(image3d_patches.shape[1]) for k in range(image3d_patches.shape[2])]

The subtraction is performed in the same way, with the multiplication replaced by subtraction.
It works, but it is extremely slow, even after I reduced the batch size from 10 to 4. Is there an efficient way to perform this operation? I would very much appreciate your help.

ptrblck · July 8, 2023, 5:43pm

I would probably try to get rid of the nested for loop and check whether it can be replaced with a single operation.
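
One possible way to do that is broadcasting: keep the center pixel as a 1x1 window so it lines up with the two trailing window dimensions, then multiply or subtract in a single call. This is only a sketch under the assumption that "center" means the single pixel at [win_size // 2, win_size // 2] of each patch; the tensor sizes are smaller than in the post to keep the demo light, and the names p2d, p3d, center2d, center3d are made up for illustration.

```python
import torch

# Smaller spatial size than in the post, just to keep the demo cheap (assumption)
image2d = torch.rand(4, 2, 64, 96)
image3d = torch.rand(4, 3, 64, 96)
win_size, stride = 5, 1

p2d = image2d.unfold(2, win_size, stride).unfold(3, win_size, stride)
p3d = image3d.unfold(2, win_size, stride).unfold(3, win_size, stride)

c = win_size // 2
# Slicing with c:c+1 keeps the window dims as size 1: [N, C, H', W', 1, 1],
# so the center broadcasts against the [.., win_size, win_size] patch.
center2d = p2d[..., c:c + 1, c:c + 1]
center3d = p3d[..., c:c + 1, c:c + 1]

mul_out = center3d * p3d  # center times every pixel of its own patch
sub_out = p2d - center2d  # every pixel of a patch minus that patch's center

print(mul_out.shape, sub_out.shape)
```

Note that the center position itself is included in both results (center squared in mul_out, zero in sub_out); it could be masked out afterwards if only the surrounding pixels are wanted.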