# Convolution with several kernels on the same input image

Hi all,

I’m performing cross-correlations between images and kernels, and I often need to cross-correlate a single image with many different kernels.

All kernels have the same shape, so a simple solution is to batch them and duplicate the image along the batch dimension. Doing so, I have the following tensors:

``````
images batch of shape [n_kernels, n_channels, img_h, img_w]
kernels batch of shape [n_kernels, n_channels, krn_h, krn_w]
``````

The cross-correlation operation works as expected, but I quickly run out of memory: with 200x200 px images, 128 channels, and 50 kernels, I need to store 200x200x128x50 floats… which is huge.
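To make that concrete, a rough back-of-the-envelope estimate (assuming float32, i.e. 4 bytes per value):

``````python
# Duplicated-image batch: 50 copies of a 128-channel 200x200 image
n_floats = 200 * 200 * 128 * 50   # 256,000,000 values
print(n_floats * 4 / 1024**3)     # ~0.95 GiB at float32
``````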

Is there any way to perform my cross-correlation operations without duplicating my input image?

Thank you!

I’m unsure I understand the use case correctly and don’t know why repeating the image in the batch dimension would be necessary. Could you post a code snippet showing the desired result, please?

Sure, here is the code snippet I use:

``````
import torch
import torch.nn.functional as F

batch = 3
channels = 2

# Define a sample image (the **same** image is batched 3 times)
f_img = torch.zeros((batch, channels, 5, 10))
# pattern is a small 3x3 patch I'm looking for with my cross-correlation
# (its content and position are arbitrary for these tests)
pattern = torch.rand(3, 3)
f_img[:, :, 2:5, 6:9] = pattern
f_img[:, :, 1:4, 1:4] = pattern * 2

# Define 3 **different** kernels
f_krn = torch.zeros((batch, channels, 3, 3))
f_krn[0, :, :, :] = pattern
f_krn[1, :, :, :] = pattern * 2
f_krn[2, :, :, :] = pattern * 3

print(f_img.shape, f_krn.shape)
# Image shape: [3, 2, 5, 10] / Kernel shape: [3, 2, 3, 3]

# Perform cross-correlation
f_img = f_img.view(1, batch * channels, f_img.shape[2], f_img.shape[3])
f_krn = f_krn.view(batch, channels, 3, 3)

result = F.conv2d(f_img, f_krn, groups=batch)

print(result)
``````

Thanks

Thanks for the code snippet.
Your approach looks valid assuming you are using different images in `f_img`.
However, based on your previous description, it seems that `f_img` would contain only a single unique image, and you are repeating it in the batch dimension to use the `view` operation with the grouped conv approach.
Your grouped conv would create 3 groups (basically splitting the channels into the “images”) and would then apply each corresponding conv kernel to the input.
If so, you should be able to use a plain convolution to get the same result, if I’m not mistaken.
This workflow would use a single image and each filter would still be applied to it.
To do so, I’ve changed your code a bit to really repeat the input image in the batch dimension and initialized the filter kernels randomly.
Could you check, if I understand your use case correctly?

``````
import torch
import torch.nn.functional as F

batch = 3
channels = 2

# Define a sample image (the **same** image is repeated 3 times)
f_img = torch.randn((1, channels, 5, 10)).repeat(batch, 1, 1, 1)

# Define 3 **different** kernels (randomly initialized)
f_krn = torch.randn((batch, channels, 3, 3))

print(f_img.shape, f_krn.shape)
# Image shape: [3, 2, 5, 10] / Kernel shape: [3, 2, 3, 3]

# Perform cross-correlation
f_img_ = f_img.view(1, batch * channels, f_img.shape[2], f_img.shape[3])
f_krn = f_krn.view(batch, channels, 3, 3)

result = F.conv2d(f_img_, f_krn, groups=batch)

# single image
out = F.conv2d(f_img[0:1], f_krn)
print((out == result).all())
> tensor(True)
``````
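In case it helps see why the two calls agree: a plain `F.conv2d` with a weight of shape `[3, 2, 3, 3]` applies each of the 3 filters to the same single image, exactly as looping over the filters one at a time would. A minimal sketch with random data (tensor names are illustrative):

``````python
import torch
import torch.nn.functional as F

img = torch.randn(1, 2, 5, 10)   # one image, 2 channels
krn = torch.randn(3, 2, 3, 3)    # 3 different kernels

# Plain conv: all 3 filters applied to the single image in one call
out = F.conv2d(img, krn)         # shape [1, 3, 3, 8]

# The same result, one filter at a time
ref = torch.cat([F.conv2d(img, krn[i:i + 1]) for i in range(3)], dim=1)
print(torch.allclose(out, ref))
``````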

You understood the question perfectly!

In fact, I didn’t know it was possible to use a plain convolution with different batch sizes for the image and the kernels. I was pretty sure that, with a batch of 3 kernels, my image needed a batch size of 3 too.
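For reference, the weight's first dimension in `F.conv2d` is the number of output channels, not a batch dimension, so it is independent of the input batch size. A quick check with the sizes from my original question (randomly initialized, names are illustrative):

``````python
import torch
import torch.nn.functional as F

x = torch.randn(1, 128, 200, 200)   # a single 128-channel 200x200 image
w = torch.randn(50, 128, 3, 3)      # 50 kernels: first dim = output channels
y = F.conv2d(x, w)
print(y.shape)  # torch.Size([1, 50, 198, 198])
``````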

(Just in case: your last line should be `print((abs(out - result) < 1e-6).all())`, presumably because of floating-point precision.)

Also yes, the better approach would be to use a small `eps` value or `torch.allclose`.
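For instance, `torch.allclose` tolerates small floating-point differences that an exact `==` comparison would flag (values here are chosen so the perturbation survives rounding in at least one element):

``````python
import torch

a = torch.tensor([1.0, 2.0])
b = a + 1e-7  # tiny perturbation, e.g. from a different op order

print(torch.allclose(a, b))  # True: within default rtol/atol
print((a == b).all())        # tensor(False): exact comparison fails
``````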