I have an image batch with size [10,3,256,832]. I want to create a new tensor from each image by dividing it into small windows, where each window moves as in a convolution operation, so consecutive windows overlap. I also want to control the size of each window and the stride. This pic shows what I mean.
Could anyone tell me how to do this, please?
tensor.unfold should work:
import torch
x = torch.arange(7*7).view(7, 7)
print(x)
# tensor([[ 0, 1, 2, 3, 4, 5, 6],
# [ 7, 8, 9, 10, 11, 12, 13],
# [14, 15, 16, 17, 18, 19, 20],
# [21, 22, 23, 24, 25, 26, 27],
# [28, 29, 30, 31, 32, 33, 34],
# [35, 36, 37, 38, 39, 40, 41],
# [42, 43, 44, 45, 46, 47, 48]])
out = x.unfold(0, 3, 1).unfold(1, 3, 1)
out = out.contiguous().view(-1, 3*3)
print(out)
# tensor([[ 0, 1, 2, 7, 8, 9, 14, 15, 16],
# [ 1, 2, 3, 8, 9, 10, 15, 16, 17],
# [ 2, 3, 4, 9, 10, 11, 16, 17, 18],
# [ 3, 4, 5, 10, 11, 12, 17, 18, 19],
# [ 4, 5, 6, 11, 12, 13, 18, 19, 20],
# [ 7, 8, 9, 14, 15, 16, 21, 22, 23],
# [ 8, 9, 10, 15, 16, 17, 22, 23, 24],
# [ 9, 10, 11, 16, 17, 18, 23, 24, 25],
# [10, 11, 12, 17, 18, 19, 24, 25, 26],
# [11, 12, 13, 18, 19, 20, 25, 26, 27],
# [14, 15, 16, 21, 22, 23, 28, 29, 30],
# [15, 16, 17, 22, 23, 24, 29, 30, 31],
# [16, 17, 18, 23, 24, 25, 30, 31, 32],
# [17, 18, 19, 24, 25, 26, 31, 32, 33],
# [18, 19, 20, 25, 26, 27, 32, 33, 34],
# [21, 22, 23, 28, 29, 30, 35, 36, 37],
# [22, 23, 24, 29, 30, 31, 36, 37, 38],
# [23, 24, 25, 30, 31, 32, 37, 38, 39],
# [24, 25, 26, 31, 32, 33, 38, 39, 40],
# [25, 26, 27, 32, 33, 34, 39, 40, 41],
# [28, 29, 30, 35, 36, 37, 42, 43, 44],
# [29, 30, 31, 36, 37, 38, 43, 44, 45],
# [30, 31, 32, 37, 38, 39, 44, 45, 46],
# [31, 32, 33, 38, 39, 40, 45, 46, 47],
# [32, 33, 34, 39, 40, 41, 46, 47, 48]])
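As a rough sketch of how the window size and step change the result: unfold takes (dimension, size, step), and the number of windows along a dimension of length L is (L - size) // step + 1. The helper name num_windows below is made up for illustration and is not part of PyTorch.

```python
import torch

def num_windows(length, size, step):
    # Hypothetical helper: how many sliding windows fit along one dimension.
    return (length - size) // step + 1

x = torch.arange(7 * 7).view(7, 7)
size, step = 5, 2

# unfold(dimension, size, step): slide along dim 0 (rows), then dim 1 (columns).
windows = x.unfold(0, size, step).unfold(1, size, step)
n = num_windows(7, size, step)

print(n)              # 2
print(windows.shape)  # torch.Size([2, 2, 5, 5])
```

Here windows[i, j] is the 5x5 window whose top-left corner sits at row i*step, column j*step of x.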
Thank you so much for your quick reply.
I have two more questions, if you don’t mind. First, in unfold(0,3,1), 3 is the window size, which gives a 3x3 window, so if I need to enlarge the window to 25 pixels, the 3 should become 5, right? I think 1 is the stride, so what is the 0?
Second, how can I do this operation for the 3 channels together? The output should have 3 batches, each representing one channel. And what if I want to do the same for all 10 images together?
I hope that my question is clear.
Thank you in advance.
It’s the dimension as described in the docs.
Specify only the spatial dimensions, not the batch or channel dimension, in the unfold operation.
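A minimal sketch of what that could look like, assuming the usual [batch, channels, height, width] layout so that the spatial dimensions are 2 and 3 (the variable names are illustrative only):

```python
import torch

images = torch.rand(10, 3, 256, 832)  # [batch, channels, height, width]
win_size, stride = 5, 1

# Unfold height (dim 2), then width (dim 3); batch and channel dims are left alone.
patches = images.unfold(2, win_size, stride).unfold(3, win_size, stride)

print(patches.shape)  # torch.Size([10, 3, 252, 828, 5, 5])
```

With this layout, patches[b, c, i, j] is the win_size x win_size window of channel c in image b whose top-left corner is at (i*stride, j*stride), so all channels and all images are handled in one call.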
For example, I want to apply this unfolding to an image with only 2 channels, but I got this error message
I don’t understand why.
And how do I specify the spatial dimensions instead of the batch or channel dimension? Could you provide a line of code to explain it?
To explain the operation I want to do in more detail:
I have one image batch with 2 channels, size=[10,2,256,382], and a second with 3 channels, size=[10,3,256,382]. I want to divide both into patches, which I did using the following code:
import math
import torch
image2d=torch.rand([10,2,256,382])
image3d=torch.rand([10,3,256,382])
win_size=5
stride=1
image2d_patches=image2d.unfold(2,win_size,stride).unfold(3,win_size,stride)
image3d_patches=image3d.unfold(2,win_size,stride).unfold(3,win_size,stride)
After that, I need to multiply the center of each patch in image3d_patches with its surrounding pixels inside the same patch, and subtract the center of each patch in image2d_patches from each surrounding pixel. I did this step by first creating two tensors of the center pixels for all patches in image2d_patches and image3d_patches:
center_im2d=image2d_patches[:,:,:,:,math.floor(image2d_patches.shape[4]/2)]
center_im3d=image3d_patches[:,:,:,:,math.floor(image3d_patches.shape[4]/2)]
Finally, I perform the multiplication and subtraction as follows:
mul_out=[center_im3d[i,j,k]*image3d_patches[i,j,k,:] for i in range(image3d_patches.shape[0]) for j in range(image3d_patches.shape[1]) for k in range(image3d_patches.shape[2])]
The subtraction is performed in the same way, by replacing multiplication with subtraction.
It does work, but it is extremely slow, even after I reduced the batch size from 10 to 4. Is there an efficient way to perform this operation? I would very much appreciate your help.
I would probably try to get rid of the nested for loop and check if one could replace it with a single operation.
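One way this might look, as a sketch rather than a drop-in replacement: it assumes "center" means the single pixel at (win_size // 2, win_size // 2) of each patch, and that the subtraction is "patch pixel minus center". The tensor sizes are shrunk purely for illustration.

```python
import torch

win_size, stride = 5, 1
image2d = torch.rand(4, 2, 32, 48)  # small sizes, chosen only for illustration
image3d = torch.rand(4, 3, 32, 48)

image2d_patches = image2d.unfold(2, win_size, stride).unfold(3, win_size, stride)
image3d_patches = image3d.unfold(2, win_size, stride).unfold(3, win_size, stride)

c = win_size // 2
# Slicing with c:c+1 keeps two trailing dims of size 1, so each center
# broadcasts over its own win_size x win_size patch.
center_im2d = image2d_patches[..., c:c + 1, c:c + 1]
center_im3d = image3d_patches[..., c:c + 1, c:c + 1]

mul_out = center_im3d * image3d_patches  # center times every pixel in its patch
sub_out = image2d_patches - center_im2d  # every pixel minus its patch center
```

Both results keep the shape of the patch tensors and include the center position itself (center times center, and zero for the subtraction). The outputs are materialized in full, so memory use grows with win_size squared.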