I need to implement a 2D convolution that operates on xyz coordinates.
My data consists of ordered points with shape HxW, where each point has 3 coordinates.
Ideally I could use nn.Conv2d and slide over it exactly like an image.
However, before the first convolution with kernel size 3x3 I need to subtract the central point coordinates from each of the neighbours. In my understanding, this is to build a “local” feature.
Question is: how can I efficiently write this custom nn.Conv2d? Can I do it with nn.Module?
I checked here but I wasn’t able to find an example using unfold. Also, I’m not sure whether its performance would be reasonable.
Apologies if it is a silly question.
Hello! No question is silly, please don’t worry; these forums are very helpful for people like us.
As I understand from your question, you need to perform a step before the convolution layer, right? I think all you need is hooks for your task. Please check this link to see if it is similar to your problem. [1]
def mycustomfunction(module, input):
    # do my stuff
    # I need to access each local block *just before* convolution
    # Should I use unfold here? How?
    pass

layer = MyConv(3, 64)
# a forward *pre*-hook runs before the module's forward, which is what we need;
# register_forward_hook would only fire *after* the convolution
layer.conv[0].register_forward_pre_hook(mycustomfunction)
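For reference, a minimal runnable sketch of the pre-hook mechanism (using a plain nn.Conv2d instead of MyConv, and a made-up whole-tensor transform — note that the hook only sees the full input tensor, not the individual local blocks):

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 64, kernel_size=3)

def center_input(module, inputs):
    # Pre-hook: runs before conv.forward; returning a tuple replaces the input.
    (x,) = inputs
    # Illustrative transform on the whole tensor (per-block offsets are not
    # reachable from here -- that is exactly the limitation of this approach):
    return (x - x.mean(dim=(2, 3), keepdim=True),)

handle = conv.register_forward_pre_hook(center_input)
out = conv(torch.randn(2, 3, 8, 8))
print(out.shape)  # torch.Size([2, 64, 6, 6])
handle.remove()
```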
Option 2, would be something like:
class MyConv(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.conv = nn.Sequential(
            ### Not sure how to use nn.Unfold here!
            nn.BatchNorm2d(out_channels),  # BatchNorm2d for 4D (B, C, H, W) input
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.25),
        )

    def forward(self, x):
        return self.conv(x)
I simply need to subtract the central item from all the elements of each local block before the convolution.
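To make the shapes concrete, here is a small sketch of how F.unfold exposes each local block as a column, and how the central element of each 3x3 block can then be subtracted (toy sizes, chosen only for illustration):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 5, 5)          # (B, C, H, W) -- e.g. xyz coords per point
blocks = F.unfold(x, kernel_size=3)  # (B, C*k*k, L) = (1, 27, 9), one column per block
blocks = blocks.view(1, 3, 9, -1)    # (B, C, k*k, L): separate the k*k positions
center = blocks[:, :, 9 // 2, :].unsqueeze(2)  # middle element of each 3x3 block
local = blocks - center              # neighbours expressed relative to the centre
print(local.shape)                   # torch.Size([1, 3, 9, 9])
```

The centre position of a flattened k x k block is index k*k // 2, so after the subtraction that slice is all zeros.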
In the end, adapting the code from this issue, I was able to do the following:
class OffsetConv2d(nn.Conv2d):
    """
    2d convolution that subtracts the central point of each local block
    from all of its neighbours before applying the kernel.

    See Also
    --------
    https://github.com/pytorch/pytorch/issues/47990
    """
    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, padding=0, bias=True):
        # bias must be passed by keyword: positionally it would land on dilation
        super().__init__(in_channels, out_channels, kernel_size, stride, padding, bias=bias)
        self.k = kernel_size

    def forward(self, x):
        k = self.k
        # nn.Conv2d stores stride and padding as tuples
        stride, padding = self.stride[0], self.padding[0]
        batch_size, _, h_in, w_in = x.shape
        h_out = (h_in + 2 * padding - k) // stride + 1
        w_out = (w_in + 2 * padding - k) // stride + 1
        # (B, C*k*k, L) with one column per local block, L = h_out * w_out
        inp_unf = F.unfold(x, (k, k), stride=stride, padding=padding)
        # Offset each local block with its central element
        inp_unf = inp_unf.view(batch_size, self.in_channels, k * k, -1)
        inp_unf = inp_unf - inp_unf[:, :, k * k // 2, :].unsqueeze(2)
        inp_unf = inp_unf.reshape(batch_size, self.in_channels * k * k, -1)
        # Convolution as a matrix multiply, reusing the inherited weight
        # (the original self.conv parameter duplicated self.weight and was all ones)
        w = self.weight.view(self.out_channels, -1)
        out_unf = inp_unf.transpose(1, 2).matmul(w.t()).transpose(1, 2)
        if self.bias is not None:
            out_unf = out_unf + self.bias.view(1, -1, 1)
        return F.fold(out_unf, (h_out, w_out), (1, 1))
I compared the above code with the standard nn.Conv2d and didn’t find any significant performance difference, so I will go ahead with this.
I hope it is right and that the code above will be useful to someone else in the same situation.
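One quick way to sanity-check the unfold + matmul + fold path is to drop the offset step and compare it against F.conv2d directly — with the same weight, the two should agree up to floating-point noise (toy sizes below, chosen only for illustration):

```python
import torch
import torch.nn.functional as F

x = torch.randn(2, 3, 8, 8)
w = torch.randn(64, 3, 3, 3)

# Unfold + matmul + fold, same recipe as OffsetConv2d but without the offset
inp_unf = F.unfold(x, (3, 3))                        # (2, 27, 36)
out_unf = inp_unf.transpose(1, 2).matmul(w.view(64, -1).t()).transpose(1, 2)
out = F.fold(out_unf, (6, 6), (1, 1))                # (2, 64, 6, 6)

ref = F.conv2d(x, w)
print(torch.allclose(out, ref, atol=1e-4))           # True
```

With the offset enabled the outputs will of course differ from plain F.conv2d, which is the whole point; this check only validates the im2col plumbing.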