Extract all nxm blocks

vainaijr · March 14, 2020, 10:04pm

If I have a 5x5 tensor, how do I get all nine 3x3 blocks from it, so that the resulting tensor has shape [9, 3, 3], or shape [9, 9] if the 3x3 blocks are flattened? For example,

x = torch.randn(5, 5)

suppose x is

tensor([[ 0.5756,  0.2463,  1.3940,  0.8473, -0.8371],
        [ 0.9690,  1.4913, -0.2129,  0.8331, -0.6322],
        [-0.0348, -1.6920, -0.0157,  0.6159,  0.1038],
        [-1.0790,  1.4303,  0.3861,  0.1293,  0.4582],
        [ 0.2815, -1.1944, -0.7612,  0.6595,  1.4611]])

then the resulting tensor should look like,

tensor([[0.5756,  0.2463,  1.3940, 0.9690,  1.4913, -0.2129, -0.0348, -1.6920, -0.0157],
 [0.2463, 1.3940,  0.8473, 1.4913, -0.2129,  0.8331, -1.6920, -0.0157,  0.6159],
...
[-0.0157,  0.6159,  0.1038, 0.3861,  0.1293,  0.4582, -0.7612,  0.6595,  1.4611]])

without using a for loop.

vainaijr · March 15, 2020, 6:09am

Is unfold a way to do this? I get a slightly different result when I use it.

For example, if I want all 2x3 blocks:

import torch
import torch.nn as nn
x = torch.randn(1, 1, 5, 5)
x

tensor([[[[-0.0197,  1.0647, -0.2223,  0.6515,  0.4126],
          [ 0.2328,  0.3959, -0.2910, -0.3321, -0.7934],
          [-0.5428, -0.8354, -0.4103, -0.8781,  0.6794],
          [ 0.0659, -1.7829,  0.0720, -0.3183,  2.2014],
          [-0.8724, -0.1767, -1.3356,  1.0183, -0.0641]]]])

a = nn.Unfold((2, 3))
a(x)

it gives me something like this,

tensor([[[-0.0197,  1.0647, -0.2223,  0.2328,  0.3959, -0.2910, -0.5428,
          -0.8354, -0.4103,  0.0659, -1.7829,  0.0720],
         [ 1.0647, -0.2223,  0.6515,  0.3959, -0.2910, -0.3321, -0.8354,
          -0.4103, -0.8781, -1.7829,  0.0720, -0.3183],
         [-0.2223,  0.6515,  0.4126, -0.2910, -0.3321, -0.7934, -0.4103,
          -0.8781,  0.6794,  0.0720, -0.3183,  2.2014],
         [ 0.2328,  0.3959, -0.2910, -0.5428, -0.8354, -0.4103,  0.0659,
          -1.7829,  0.0720, -0.8724, -0.1767, -1.3356],
         [ 0.3959, -0.2910, -0.3321, -0.8354, -0.4103, -0.8781, -1.7829,
           0.0720, -0.3183, -0.1767, -1.3356,  1.0183],
         [-0.2910, -0.3321, -0.7934, -0.4103, -0.8781,  0.6794,  0.0720,
          -0.3183,  2.2014, -1.3356,  1.0183, -0.0641]]])

but this is not what I want.
How should I change this to get 2x3 blocks from the input?

one way that I found, is to reshape this tensor,

a(x).reshape(1, 12, 6)

which gives

tensor([[[-0.0197,  1.0647, -0.2223,  0.2328,  0.3959, -0.2910],
         [-0.5428, -0.8354, -0.4103,  0.0659, -1.7829,  0.0720],
         [ 1.0647, -0.2223,  0.6515,  0.3959, -0.2910, -0.3321],
         [-0.8354, -0.4103, -0.8781, -1.7829,  0.0720, -0.3183],
         [-0.2223,  0.6515,  0.4126, -0.2910, -0.3321, -0.7934],
         [-0.4103, -0.8781,  0.6794,  0.0720, -0.3183,  2.2014],
         [ 0.2328,  0.3959, -0.2910, -0.5428, -0.8354, -0.4103],
         [ 0.0659, -1.7829,  0.0720, -0.8724, -0.1767, -1.3356],
         [ 0.3959, -0.2910, -0.3321, -0.8354, -0.4103, -0.8781],
         [-1.7829,  0.0720, -0.3183, -0.1767, -1.3356,  1.0183],
         [-0.2910, -0.3321, -0.7934, -0.4103, -0.8781,  0.6794],
         [ 0.0720, -0.3183,  2.2014, -1.3356,  1.0183, -0.0641]]])

I think this works, but it fails when I add channels to the input.
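A note on the reshape: nn.Unfold returns shape (N, C*kh*kw, L), where L is the number of blocks, so each row of the output above holds one kernel offset across all blocks, not one block. A reshape alone does not move the block index to the front, which is likely why it breaks once channels are added. Below is a minimal sketch that transposes first and then splits the channel and kernel dimensions. The input values and the names kh, kw, cols, and blocks are illustrative assumptions, not part of the original post.

```python
import torch
import torch.nn as nn

# Illustrative input: batch of 1, 2 channels, 5x5 spatial size
x = torch.arange(2 * 5 * 5, dtype=torch.float32).reshape(1, 2, 5, 5)
kh, kw = 2, 3  # block height and width

cols = nn.Unfold((kh, kw))(x)  # (N, C*kh*kw, L) = (1, 12, 12)
n, c = x.shape[:2]

# Put the block index L before the per-block values,
# then split C*kh*kw back into (C, kh, kw).
blocks = cols.transpose(1, 2).reshape(n, -1, c, kh, kw)  # (1, 12, 2, 2, 3)

# Block 0 is the top-left 2x3 window of every channel
print(torch.equal(blocks[0, 0], x[0, :, 0:2, 0:3]))
```

Blocks come out in raster order (left to right, then top to bottom), so block index 4 here corresponds to the window starting at row 1, column 1.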

edit:
I found one way,

x.unfold(2, 2, 1).unfold(3, 3, 1)

This takes size-2 windows along dim 2 (height) and size-3 windows along dim 3 (width), both with step 1, so each window is a 2x3 block.
If I wanted (4, 4) blocks,
then,

x.unfold(2, 4, 1).unfold(3, 4, 1)
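Applied to the 5x5 case from the first post, the same Tensor.unfold idea could look like the sketch below. It assumes a plain 2D tensor, so the windows are taken along dims 0 and 1. The arange input and the names windows, blocks, and flat are only for illustration.

```python
import torch

# Illustrative 5x5 input with easy-to-read values 0..24
x = torch.arange(25, dtype=torch.float32).reshape(5, 5)

# Size-3 windows with step 1 along dim 0, then along dim 1.
# windows[i, j] is the 3x3 block whose top-left corner is x[i, j].
windows = x.unfold(0, 3, 1).unfold(1, 3, 1)  # (3, 3, 3, 3)

blocks = windows.reshape(9, 3, 3)  # nine 3x3 blocks
flat = windows.reshape(9, 9)       # each block flattened to a row

print(blocks.shape, flat.shape)
```

The unfolded result is a view on x, so reshape makes a copy here because the view is not contiguous.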