# Convolution operation without the final summation

Hello

Is there any way to perform a vanilla convolution operation but without the final summation? Assume that we have a feature map, X, of size [B, 3, 64, 64] and a single kernel of size [1, 3, 3, 3]. A vanilla convolution produces a feature map of size [B, 1, 62, 62], while I'm after a way to get a feature map of size [B, 3, 62, 62], i.e. the result just before the convolutional channels are collapsed/summed into a single feature map.

Thanks

How would you like to perform the reduction at each step?
Generally, you could `unfold` the input into `3x3x3` patches, perform the multiplication with the kernel, (sum the result), and `fold`/reshape to the output shape. Since you are not performing the sum, you would have overlapping patches and I'm not sure how you would like to reduce/reshape them back.
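
For example, assuming the reduction you want to skip is the one across the input channels, a rough `unfold`-based sketch (using the shapes from your example; the tensors here are made up for illustration) could look like this:

```
import torch
import torch.nn.functional as F

B = 2
x = torch.randn(B, 3, 64, 64)
weight = torch.randn(1, 3, 3, 3)   # a single kernel

# unfold into 3x3 patches: [B, 3*3*3, 62*62]
patches = F.unfold(x, kernel_size=3)
# separate the channel dim from the flattened 3x3 window: [B, 3, 9, 62*62]
patches = patches.view(B, 3, 9, -1)

# multiply with the kernel and sum only over the spatial window,
# keeping the per-channel responses: [B, 3, 62*62] -> [B, 3, 62, 62]
per_channel = (patches * weight.view(1, 3, 9, 1)).sum(dim=2)
per_channel = per_channel.view(B, 3, 62, 62)

# sanity check: summing over the channels gives the vanilla convolution
print(torch.allclose(per_channel.sum(dim=1, keepdim=True),
                     F.conv2d(x, weight), atol=1e-4))
```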

I want to avoid the reduction across the channels, not the spatial multiplication.
So, each kernel of size 3x3x3 should give three feature maps, instead of merging them into a single feature map in the output.

In that case, the `groups` argument should yield the expected results using `groups=in_channels`:

```
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 3, 3, groups=3)
x = torch.randn(10, 3, 64, 64)
output = conv(x)
```
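
This yields an output of shape [10, 3, 62, 62], where each output channel is computed from a single input channel with its own 1x3x3 filter.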

I'm already familiar with this option, but using `groups` is not actually what I need.

The operation you wrote just performs the convolution with a single 3x3x3 kernel, while I need the same operation for, say, 32 different kernels.

It’s a bit hard to explain what I am after.

Thanks for the time

Try to give an example with simple tensors and simple kernels, along with the expected results, so we can see what you want to do.

Hi @Saeed_Izadi1, did you find a solution to this problem?

@ptrblck What Saeed was after is the following:
A normal convolution between a [Bx3x6x6] input and a [1x3x3x3] kernel produces a [Bx1x4x4] response. The reason is that after performing the spatial convolution channel-wise, i.e. along the last two indices (producing a tensor of size [Bx1x3x4x4]), the channel-wise responses are summed into a single channel, so the convolution produces a tensor of size [Bx1x4x4]. Is there a way to obtain access to the tensor [Bx1x3x4x4]?

Thanks a lot!

Regards,
David

Wouldn’t my code snippet yield exactly this?
Have a look at this comparison with a manual approach, where each kernel is used on a single input channel:

```
import torch
import torch.nn as nn
import torch.nn.functional as F

# Grouped approach
conv = nn.Conv2d(3, 3, 3, groups=3, bias=False)
x = torch.randn(10, 3, 64, 64)
output = conv(x)

# Compare with manual approach
kernels = conv.weight  # [3, 1, 3, 3]: one single-channel kernel per input channel
output_manual = []
for idx in range(3):
    kernel = kernels[idx:idx+1]
    input = x[:, idx:idx+1]
    out = F.conv2d(input, kernel)
    output_manual.append(out)
output_manual = torch.cat(output_manual, dim=1)

print((output_manual - output).abs().max())
```

Let me know if I still misunderstand the use case.
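
As a purely illustrative sketch (not code from the thread, with made-up tensors), the same grouped idea can also recover the intermediate [Bx1x3x4x4] tensor from a given [1x3x3x3] kernel, rather than from the grouped layer's own weights:

```
import torch
import torch.nn.functional as F

B = 2
x = torch.randn(B, 3, 6, 6)
w = torch.randn(1, 3, 3, 3)   # a single "normal" convolution kernel

# convolve each input channel with its slice of the kernel:
# the transposed weight has shape [3, 1, 3, 3] -> one single-channel filter per input channel
per_channel = F.conv2d(x, w.transpose(0, 1).contiguous(), groups=3)  # [B, 3, 4, 4]
intermediate = per_channel.unsqueeze(1)                               # [B, 1, 3, 4, 4]

# summing over the channel dim reproduces the normal convolution
print(torch.allclose(intermediate.sum(dim=2), F.conv2d(x, w), atol=1e-5))
```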

Hi @ptrblck, sorry about the confusion, you are completely right! That is exactly what your proposed approach does.

Best Regards and Happy New Year,
David


If I understand @Saeed_Izadi1 correctly, I think the correct way to achieve that should be something like:

```
conv = nn.Conv2d(3, 9, 3, groups=3)
```
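
To sketch how this extends to the 32 kernels mentioned earlier (just an illustration; the reshape/permute layout at the end is one possible choice, not something prescribed in the thread):

```
import torch
import torch.nn.functional as F

B, K = 2, 32                        # K = number of kernels
x = torch.randn(B, 3, 64, 64)
W = torch.randn(K, 3, 3, 3)         # the K "normal" convolution kernels

# rearrange so each (input channel, kernel) pair gets its own single-channel filter
grouped_weight = W.transpose(0, 1).reshape(3 * K, 1, 3, 3)

# grouped convolution: each input channel is convolved with its K filters
out = F.conv2d(x, grouped_weight, groups=3)                # [B, 3*K, 62, 62]
out = out.view(B, 3, K, 62, 62).permute(0, 2, 1, 3, 4)     # [B, K, 3, 62, 62]

# summing over the channel dim reproduces the vanilla convolution
print(torch.allclose(out.sum(dim=2), F.conv2d(x, W), atol=1e-4))
```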