How to efficiently assign features from a 3x3 window in another matrix

Hello,

I have a point cloud: a set of nodes, each with its own (x, y) coordinates. For instance:

X =    [[1,1], 
        [2,2], 
        [3,4]]

Then I have a large matrix of features, let’s say an 8x8 matrix:

M =     [[1,2,3,2,5,6,7,8],
        [4,5,6,1,8,1,2,4],
        [4,5,H,3,8,1,2,4],
        [1,5,4,4,1,1,2,3],
        [2,5,2,7,1,1,4,3],
        [3,5,4,8,5,1,4,3],
        [4,4,3,6,4,1,1,3],
        [4,4,6,5,8,1,3,5]]

From the task I know that the features around each coordinate are important for the point cloud (let’s say the “kernel size” is 3), and I want to get the flattened features from those positions. So, for point [2,2] we’d get the neighbours of H: 5,6,1,5,3,5,4,4. Then I want to concatenate them to the point cloud features, so for point [2,2] we’d get [2, 2, 5, 6, 1, 5, 3, 5, 4, 4].

I probably know how to do this on the CPU, but is there a good algorithm for doing it efficiently? It would be done on the fly during training on the GPU.

thanks!

I think this achieves the spirit of what you’re trying to accomplish, but some tweaks might be needed depending on how you want to handle the H/center of each patch, etc. (here I replaced H with 0 and kept the center in the output).

import torch

X =    [[1,1],
        [2,2],
        [3,4]]

M =     [[1,2,3,2,5,6,7,8],
        [4,5,6,1,8,1,2,4],
        [4,5,0,3,8,1,2,4],
        [1,5,4,4,1,1,2,3],
        [2,5,2,7,1,1,4,3],
        [3,5,4,8,5,1,4,3],
        [4,4,3,6,4,1,1,3],
        [4,4,6,5,8,1,3,5]]

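# X holds (row, col) indices into M; M is the feature map
X = torch.tensor(X, device='cuda', dtype=torch.float)
M = torch.tensor(M, device='cuda', dtype=torch.float)

# Extract every 3x3 patch; padding=1 gives one (zero-padded) patch per position of M
unfold = torch.nn.Unfold(3, padding=1)
M2 = unfold(M.reshape(1, 1, M.size(0), M.size(1)))  # shape (1, 9, 64)

# Flatten (row, col) to row * width + col so it indexes the patch dimension
X_flat = X[:,0]*M.size(1) + X[:,1]
M2 = M2.permute(0, 2, 1)  # shape (1, 64, 9)
out = M2[:,X_flat.long(),:]
print(out.reshape(X.size(0), -1))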
tensor([[1., 2., 3., 4., 5., 6., 4., 5., 0.],
        [5., 6., 1., 5., 0., 3., 5., 4., 4.],
        [3., 8., 1., 4., 1., 1., 7., 1., 1.]], device='cuda:0')
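
To get the concatenated form from the question (coordinates followed by the patch values), something like the following might work. This is only a sketch under a few assumptions: the coordinates are (row, col) indices stored as integers, the center value is kept in the patch, and the names `k`, `patches`, `flat_idx`, `feats`, and `point_feats` are mine. It falls back to the CPU when no GPU is available.

```python
import torch
import torch.nn.functional as F

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# (row, col) indices of the points, kept as integers so they can index directly
X = torch.tensor([[1, 1], [2, 2], [3, 4]], device=device)

M = torch.tensor([[1, 2, 3, 2, 5, 6, 7, 8],
                  [4, 5, 6, 1, 8, 1, 2, 4],
                  [4, 5, 0, 3, 8, 1, 2, 4],
                  [1, 5, 4, 4, 1, 1, 2, 3],
                  [2, 5, 2, 7, 1, 1, 4, 3],
                  [3, 5, 4, 8, 5, 1, 4, 3],
                  [4, 4, 3, 6, 4, 1, 1, 3],
                  [4, 4, 6, 5, 8, 1, 3, 5]], dtype=torch.float, device=device)

k = 3  # kernel size (assumed odd so that padding=k // 2 keeps one patch per pixel)

# unfold expects (N, C, H, W) and returns (N, C*k*k, H*W)
patches = F.unfold(M[None, None], kernel_size=k, padding=k // 2)
patches = patches[0].T  # (H*W, k*k): one flattened patch per position of M

# Row-major flat index of each point: row * width + col
flat_idx = X[:, 0] * M.size(1) + X[:, 1]
feats = patches[flat_idx]  # (num_points, k*k)

# Prepend the coordinates to the patch features
point_feats = torch.cat([X.to(feats.dtype), feats], dim=1)  # (num_points, 2 + k*k)
print(point_feats[1])  # point [2, 2] -> 2, 2, 5, 6, 1, 5, 0, 3, 5, 4, 4
```

Points on the border of M get zeros for the positions that fall outside the matrix, because of the padding. To drop the center value, one option is to remove column `k * k // 2` from `feats` before concatenating.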

Thank you, this is exactly what I wanted. I cannot get my head around the unfold operation… By the way, can this easily be changed to work with an M that has more channels (with the features concatenated over the channels as well)?

Thanks.

There should be no restriction on the number of channels, as unfold is basically doing a “convolution” without the reduction: with C channels, each 3x3 patch is simply flattened into C*9 values instead of 9.
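
As a rough illustration of the multi-channel case, here is a sketch that assumes M has shape (C, H, W) and is filled with made-up values from `torch.arange`; the variable names are hypothetical. The flattened patch is ordered channel first, so the first 9 values come from channel 0, the next 9 from channel 1, and so on.

```python
import torch
import torch.nn.functional as F

C, H, W, k = 2, 8, 8, 3

# Dummy multi-channel feature map (values are arbitrary, for illustration only)
M = torch.arange(C * H * W, dtype=torch.float).reshape(C, H, W)
X = torch.tensor([[1, 1], [2, 2], [3, 4]])  # (row, col) indices

# Add a batch dimension; unfold returns (1, C*k*k, H*W)
patches = F.unfold(M[None], kernel_size=k, padding=k // 2)
patches = patches[0].T  # (H*W, C*k*k)

flat_idx = X[:, 0] * W + X[:, 1]
feats = patches[flat_idx]  # (num_points, C*k*k), already concatenated over channels

# Optional: view the features as one k x k window per channel to inspect them
per_channel = feats.reshape(-1, C, k, k)
print(feats.shape)        # torch.Size([3, 18])
print(per_channel[1, 1])  # window around [2, 2] in channel 1
```

The indexing step is the same as in the single-channel version; only the length of each feature vector changes.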