So I have been trying to unfold an image tensor into multiple sliding windows and then fold it back into an image. I have found multiple threads about this, but none that have solved my problem. Right now I have successfully unfolded my images into sliding windows like this:
    import torch
    import torch.nn as nn

    im = torch.arange(0, 81).view(1, 1, 9, 9)
    im2 = torch.arange(0, 81).view(1, 1, 9, 9)
    x = torch.cat((im, im2), dim=0)  # shape = [2, 1, 9, 9]

    # Slide a 3-wide window with stride 1 over the height, then the width
    patches = x.unfold(2, 3, 1)
    print("unfold: ", patches.shape)  # [2, 1, 7, 9, 3]
    patches = patches.unfold(3, 3, 1)
    print("unfold 2: ", patches.shape)  # [2, 1, 7, 7, 3, 3]

    # Move the channel dim next to the window dims
    patches = patches.permute(0, 2, 3, 1, 4, 5).contiguous()
    print("after permute: ", patches.shape)  # [2, 7, 7, 1, 3, 3]

    # Merge the 7 x 7 window grid (and the single channel) into one dim
    patches = patches.view(patches.size(0), -1, patches.size(4), patches.size(5))
    print(patches.shape)  # [2, 49, 3, 3]
At this point I have a bunch of sliding windows that I can work with, but before building a network around them I decided to make sure that I could fold them back. I have been working on this for a while now and seem to be stuck.
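As a sanity check on the windows built above, here is a minimal sketch that compares them with what `torch.nn.functional.unfold` produces for the same input. The float cast and the names `manual`, `cols` and `from_unfold` are assumptions added for this illustration, and the reshape relies on there being a single channel.

```python
import torch
import torch.nn.functional as F

# Same toy batch as above, cast to float (an assumption made for this sketch).
x = torch.arange(0, 81, dtype=torch.float32).view(1, 1, 9, 9).repeat(2, 1, 1, 1)

# Manual sliding windows, exactly as in the snippet above.
manual = x.unfold(2, 3, 1).unfold(3, 3, 1)              # [2, 1, 7, 7, 3, 3]
manual = manual.permute(0, 2, 3, 1, 4, 5).contiguous()  # [2, 7, 7, 1, 3, 3]
manual = manual.view(manual.size(0), -1, 3, 3)          # [2, 49, 3, 3]

# F.unfold returns [N, C * kH * kW, L]; here that is [2, 9, 49].
cols = F.unfold(x, kernel_size=(3, 3))

# Put the window index first, then restore the 3 x 3 window shape.
from_unfold = cols.permute(0, 2, 1).reshape(2, 49, 3, 3)

same = torch.equal(manual, from_unfold)
print(same)
```

If the two agree, the manual windows are in the same order that `F.unfold` uses, just with the window index and the flattened-window dim swapped.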
I have tried using fold in these ways:
    fold = nn.Fold(output_size=(9, 9), kernel_size=(3, 3))
    together = fold(patches)
    print(together.shape)

    fold = nn.Fold(output_size=9, kernel_size=3)
    together = fold(patches)
    print(together.shape)
But I keep running into problems. One big concern is that the documentation for fold seems wrong to me. It says:
    .. warning::
        Currently, only 4-D output tensors (batched image-like tensors) are supported.
But when I try to fold 4-D tensors I get this error:
Input Error: Only 3D input Tensors are supported (got 4D)
Any help would be much appreciated! Thanks.
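On the shape mismatch above: the warning in the docs refers to the output of fold being 4-D, while the input it expects is 3-D with shape `[N, C * kH * kW, L]`, where `L` is the number of windows. Below is a minimal sketch of a round trip under that reading. The float cast, the division by an overlap count, and the names `cols`, `summed`, `counts` and `restored` are assumptions added for illustration, and the reshape relies on there being a single channel.

```python
import torch
import torch.nn.functional as F

# Same toy batch as above, cast to float (an assumption made for this sketch).
x = torch.arange(0, 81, dtype=torch.float32).view(1, 1, 9, 9).repeat(2, 1, 1, 1)

# Rebuild the [2, 49, 3, 3] windows from the unfold calls in the question.
patches = x.unfold(2, 3, 1).unfold(3, 3, 1)               # [2, 1, 7, 7, 3, 3]
patches = patches.permute(0, 2, 3, 1, 4, 5).contiguous()  # [2, 7, 7, 1, 3, 3]
patches = patches.view(patches.size(0), -1, 3, 3)         # [2, 49, 3, 3]

# fold wants [N, C * kH * kW, L]: flatten each window, then put the window index last.
cols = patches.reshape(2, 49, 9).permute(0, 2, 1).contiguous()  # [2, 9, 49]

# fold sums the values of overlapping windows at each pixel ...
summed = F.fold(cols, output_size=(9, 9), kernel_size=(3, 3))   # [2, 1, 9, 9]

# ... so fold a tensor of ones to count how many windows cover each pixel.
counts = F.fold(torch.ones_like(cols), output_size=(9, 9), kernel_size=(3, 3))

restored = summed / counts
print(restored.shape)
print(torch.allclose(restored, x))
```

The division is only needed because the windows overlap (stride 1 with a 3 x 3 kernel); with non-overlapping windows, fold alone would return the original image.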