Transform NxWxH without loop / with autograd

trypag · November 20, 2017, 4:05pm

Hi,

I am looking for a way to transform a NxWxH LongTensor into a (NxWxH)x9 LongTensor. The transform extracts a square patch at each position and returns a 2D tensor with (NxWxH) rows and (number of elements in the square) columns. Where a full square cannot be extracted (at the borders), the missing entries should be filled with a given value; the original tensor can of course be padded beforehand.
I already know how to do this with a for loop over each dimension, but I am looking for a smarter way to cut the cost, and so far I can’t think of a vectorized op that does this. It needs to support autograd, as I don’t want to break the graph.

It would look like this:

target = torch.LongTensor(4, 15, 15)

# magic transform here

# shape of expected tensor
trans_target = torch.LongTensor(4 * 15 * 15, 9)
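
For reference, the loop-based version mentioned above might look roughly like the sketch below. The function name `patches_loop`, the `filler` argument, and the row-major ordering of the 9 columns are assumptions made for illustration; the last two dimensions are treated as the spatial ones.

```python
import torch
import torch.nn.functional as F

def patches_loop(target, filler=0):
    # target: (N, H, W) tensor; returns (N*H*W, 9), one 3x3 neighbourhood per row.
    n, h, w = target.shape
    # Pad the last two dimensions by 1 on every side with the fill value.
    padded = F.pad(target, (1, 1, 1, 1), mode="constant", value=filler)
    rows = []
    for b in range(n):
        for i in range(h):
            for j in range(w):
                # The window centred on original position (i, j) starts at (i, j) in the padded tensor.
                rows.append(padded[b, i:i + 3, j:j + 3].reshape(9))
    return torch.stack(rows)

target = torch.arange(2 * 3 * 4).reshape(2, 3, 4)
trans_target = patches_loop(target, filler=-1)
print(trans_target.shape)
```

Column 4 of each row is the centre of the patch, so `trans_target[:, 4]` recovers the flattened input.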

This question might be a little ambitious, but I know some of you can be really creative when it comes to this kind of tensor manipulation.

I will post updates if I find anything interesting.
Thanks !!

SimonW · November 20, 2017, 4:20pm

You can do a 2d convolution with a specific weight tensor of shape (9, 1, 3, 3) to achieve this… although the tensor will have a lot of 0 entries and conv fwd&bwd is not super optimized for this case. I’m not aware of any better way to do this other than write a custom cuda kernel though…
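
A minimal sketch of the convolution idea, under some assumptions: the input is cast to float (convolution and autograd are generally not available for integer tensors), the weight is built as nine one-hot 3x3 kernels, and the border is handled by padding with a constant first. The name `patches_conv` and the `filler` argument are made up for this example.

```python
import torch
import torch.nn.functional as F

def patches_conv(x, filler=0.0):
    # x: float tensor of shape (N, H, W); returns (N*H*W, 9).
    # Kernel k is all zeros except a single 1 at flat position k of the 3x3 window,
    # so output channel k simply copies that neighbour.
    weight = torch.eye(9, dtype=x.dtype, device=x.device).view(9, 1, 3, 3)
    # Add a channel dimension and pad the border with the fill value.
    padded = F.pad(x.unsqueeze(1), (1, 1, 1, 1), mode="constant", value=filler)
    out = F.conv2d(padded, weight)  # (N, 9, H, W)
    # Move the 9 neighbours to the last dimension, then flatten positions into rows.
    return out.permute(0, 2, 3, 1).reshape(-1, 9)

x = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4).requires_grad_()
out = patches_conv(x, filler=-1.0)
out.sum().backward()  # gradients flow back to x
print(out.shape, x.grad.shape)
```

Most of the multiplications here are by zero, which is the inefficiency noted above.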

SimonW · November 20, 2017, 4:25pm

oh wait, maybe this is better:

pad = nn.ConstantPad2d(1, filler)
input = pad(input.unsqueeze(1))
shifted = []
for i in range(9):
  shifted.append(input[... shift it here])
output = torch.cat(shifted, 1)

trypag · November 20, 2017, 4:49pm

@SimonW your approach is really interesting, but I am not sure I understand how to implement the shift part. Can you give a simple case, like a right or left shift, please? Did you mean shifting with a convolution?

SimonW · November 20, 2017, 4:53pm

Sure. So after padding, input is now of shape (nbatch, 1, h+2, w+2).

Shifting to left by 1 is input[:, :, 1:-1, 2:].
Shifting to left by 1 and to top by 1 is input[:, :, 2:, 2:].

Disclaimer: I completely theory-crafted this in my head and haven’t actually tried. But I think it should work.
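
Putting the padding and the nine slices together, the whole thing could be sketched as follows. This is an untested-in-production illustration: `patches_shift` and `filler` are names chosen for the example, the final `permute`/`reshape` to get one row per position is an added step, and a float copy is assumed wherever gradients are needed.

```python
import torch
import torch.nn as nn

def patches_shift(x, filler=0):
    # x: tensor of shape (N, H, W); returns (N*H*W, 9).
    n, h, w = x.shape
    pad = nn.ConstantPad2d(1, filler)
    padded = pad(x.unsqueeze(1))  # (N, 1, H+2, W+2)
    # Each (di, dj) offset is one of the 9 shifted views; (1, 1) is the unshifted centre.
    shifted = [padded[:, :, di:di + h, dj:dj + w]
               for di in range(3) for dj in range(3)]
    out = torch.cat(shifted, 1)  # (N, 9, H, W)
    # Move the 9 neighbours to the last dimension, then flatten positions into rows.
    return out.permute(0, 2, 3, 1).reshape(-1, 9)

target = torch.arange(2 * 3 * 4).reshape(2, 3, 4)
trans_target = patches_shift(target, filler=-1)
print(trans_target.shape)
```

Slicing and `torch.cat` are both differentiable, so with a float input that has `requires_grad=True` the graph stays intact.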

trypag · November 20, 2017, 4:57pm

Alright, I like that, thank you @SimonW you rock !