2D convolution with 3D kernel

spacemeerkat · June 9, 2020, 1:39pm

I am trying to perform a convolution over the Height and Width dimensions of a batch of input tensor cubes using kernels (which I have made myself) for every depth slice, without any movement of the kernel in the 3rd dimension (in this case the depth).

So say I had a batch of 3 tensor cubes:

import torch 

batch = torch.rand((3,4,4,4))

I would like to convolve each cube with some 2D kernels (1 for each batch member):

weights = torch.rand((3,4,4))

but convolve only in the Height and Width dimensions. I suspect the answer is some variant of the following code block:

kernels = torch.nn.Parameter(weights,requires_grad=False)

output = torch.nn.functional.conv3d(batch,weights,padding=1)

Is there a preferred/optimal way to do this? The only way I can see is to perform a 3D convolution but somehow set the stride to be [0,1,1] which I don’t think PyTorch will allow me to do.

Perhaps this is in fact a very simple problem that can be solved with 2D convolutions, in which case I would love to know how. The key points here are:

The output should have the same shape as the input (obviously padding helps but you can’t just run a 2D convolution over each batch member or the depth axis will come out as 1?)
Each batch member should be convolved with its own kernel.

Many thanks in advance!

Edit:

I’ve been playing around and think I may have this working using the following code:

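import torch

batch = torch.rand((8,1,30,30,30))  # (N, C, D, H, W)

weights = torch.rand((1,1,1,3,3))  # (out_channels, in_channels, kD, kH, kW)

# kernel_size matches the weights: 1 deep, 3x3 in height and width
conv = torch.nn.Conv3d(1,1,(1,3,3),padding=(0,1,1),bias=False)

# replace the randomly initialised weight with the hand-made kernel
conv.weight = torch.nn.Parameter(weights,requires_grad=False)

output = conv(batch)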
print(output.shape)

I am still confused, though, as to how passing a 2D kernel to a 3D convolution over a 3D tensor performs a convolution over each depth slice.
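One way to see why this works is to compare the 3D convolution against an explicit per-slice 2D convolution. The sketch below assumes the same shapes as the edit above and uses the functional API; the variable names `out3d`, `out2d` and `kernel2d` are just for illustration. Because the kernel is only one element deep and there is no padding along depth, each output slice depends only on the matching input slice.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
batch = torch.rand(8, 1, 30, 30, 30)   # (N, C, D, H, W)
weights = torch.rand(1, 1, 1, 3, 3)    # (out_ch, in_ch, kD, kH, kW), with kD = 1

# 3D convolution whose kernel is one element deep: depth slices are never mixed.
out3d = F.conv3d(batch, weights, padding=(0, 1, 1))

# The same operation written explicitly: one 2D convolution per depth slice.
kernel2d = weights[:, :, 0]            # drop the depth axis -> (1, 1, 3, 3)
slices = [
    F.conv2d(batch[:, :, d], kernel2d, padding=1)
    for d in range(batch.shape[2])
]
out2d = torch.stack(slices, dim=2)     # restack the slices along depth

print(out3d.shape == batch.shape)
print(torch.allclose(out3d, out2d, atol=1e-4))
```

If both checks print True, the 3D convolution with a depth-1 kernel is behaving as a 2D convolution that slides over height and width only.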

mailcorahul · June 11, 2020, 11:18am

I think what you are looking for is a depthwise separable convolution, where a 2D filter is applied to each channel of a 3D conv volume to reduce the computational cost. One constraint is that the input and output have the same number of channels (depth), i.e. there is one 2D filter for every channel in the input conv volume.
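As a rough sketch of the "one 2D filter per channel" idea, the depthwise step can be written in PyTorch with the `groups` argument of `conv2d`, treating the depth axis as the channel axis. The shapes and the names `x`, `filters` and `ref` are assumptions for illustration. Note that a full depthwise separable convolution usually follows this step with a 1x1 pointwise convolution, which is left out here.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.rand(3, 4, 8, 8)          # (N, C, H, W): depth is treated as channels
filters = torch.rand(4, 1, 3, 3)    # one 3x3 filter per channel: (C, 1, kH, kW)

# groups=C means each filter only sees its own channel (depthwise convolution).
y = F.conv2d(x, filters, padding=1, groups=x.shape[1])

# Check channel 2 against an ordinary single-channel convolution.
ref = F.conv2d(x[:, 2:3], filters[2:3], padding=1)

print(y.shape)
print(torch.allclose(y[:, 2:3], ref, atol=1e-4))
```

The output keeps the input's channel count, which matches the constraint mentioned above.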

spacemeerkat · June 30, 2020, 5:58pm

Thanks for the reply, and sorry for my slow response! I’ll look into depthwise separable convolutions and get back to you.

Update: I’ve had a look into depthwise separable convolutions, and they appear to be equivalent to my edit above. The difference is that I explicitly stack 2D kernels into one 3D kernel per object that I wish to convolve with, whereas a depthwise separable convolution convolves each channel with a 2D kernel and then stacks the results into a 3D output. I think that’s how it works, anyway.
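For the original requirement (each batch member convolved with its own 2D kernel, output the same shape as the input), one possible approach is to fold the batch and depth axes into the channel axis and use a grouped 2D convolution. This is only a sketch under assumptions: the 3x3 kernel size and the names `B`, `D`, `kernels`, `w` and `ref` are illustrative, and an odd kernel size is chosen so that `padding=1` preserves height and width.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
B, D, H, W = 3, 4, 6, 6
batch = torch.rand(B, D, H, W)      # B cubes, each with D depth slices
kernels = torch.rand(B, 3, 3)       # one 2D kernel per cube (3x3 is an assumption)

# Fold batch and depth into the channel axis: (1, B*D, H, W).
x = batch.reshape(1, B * D, H, W)
# Repeat each cube's kernel once per depth slice: (B*D, 1, 3, 3).
w = kernels.repeat_interleave(D, dim=0).unsqueeze(1)

# groups=B*D: every slice is convolved on its own, with its cube's kernel.
out = F.conv2d(x, w, padding=1, groups=B * D).reshape(B, D, H, W)

# Reference: a plain loop over cubes and depth slices.
ref = torch.stack([
    torch.stack([
        F.conv2d(batch[b, d][None, None], kernels[b][None, None], padding=1)[0, 0]
        for d in range(D)
    ])
    for b in range(B)
])

print(out.shape)
print(torch.allclose(out, ref, atol=1e-4))
```

The grouped call avoids the Python loop, which should matter more as the batch and depth sizes grow.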