# Block squeezing: simplify sequence of convolutional layers

Hi guys,

I’m trying to reparametrize a sequence of convolutional layers into a single convolutional layer, but I’m running into some problems.

Let me explain the goal: given a random tensor `x` and two convolutional layers (`conv1`, `conv2`), I want to reparametrize them into a single convolutional layer (`conv_rep`) such that `conv2(conv1(x)) == conv_rep(x)`.

The kernel reparametrization function:

```python
import torch
import torch.nn.functional as F
from torch import Tensor

def reparametrize_conv_kernels(k1: Tensor, k2: Tensor) -> Tensor:
    """
    Reparametrize two convolutional kernels into a single one.

    Reference: https://arxiv.org/pdf/2204.00826.pdf (Figure 4a)

    Parameters
    ----------
    k1
        tensor of shape (ch_out1, ch_in1, ks1, ks1)
    k2
        tensor of shape (ch_out2, ch_out1, ks2, ks2)

    Returns
    -------
    Tensor of shape (ch_out2, ch_in1, ks1+ks2-1, ks1+ks2-1)
    """
    ks2 = k2.shape[-1]
    # Swap the in/out channel dims of k1 so it can act as the conv2d input,
    # and flip k2 so that cross-correlation computes a full convolution.
    k1 = k1.permute(1, 0, 2, 3)             # (ch_in1, ch_out1, ks1, ks1)
    k2 = k2.flip(-1, -2)
    k3 = F.conv2d(k1, k2, padding=ks2 - 1)  # (ch_in1, ch_out2, ks1+ks2-1, ks1+ks2-1)
    return k3.permute(1, 0, 2, 3)
```
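For context, the identity being used is that two stacked cross-correlations equal a single cross-correlation with the *full convolution* of the two kernels (hence the merged kernel size `ks1 + ks2 - 1`). Here is a self-contained sanity check I put together (my own sketch, assuming stride 1 and no padding; all names are illustrative):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Two random kernels: k1 maps 5 -> 3 channels, k2 maps 3 -> 1 channel.
k1 = torch.randn(3, 5, 3, 3)   # (ch_out1, ch_in1, ks1, ks1)
k2 = torch.randn(1, 3, 3, 3)   # (ch_out2, ch_out1, ks2, ks2)

# Merge: cross-correlate k1 (in/out channels swapped) with the flipped k2,
# using "full" padding so the result has spatial size ks1 + ks2 - 1.
ks2 = k2.shape[-1]
k3 = F.conv2d(k1.permute(1, 0, 2, 3), k2.flip(-1, -2), padding=ks2 - 1)
k3 = k3.permute(1, 0, 2, 3)    # (ch_out2, ch_in1, 5, 5)

x = torch.randn(1, 5, 8, 8)
out_seq = F.conv2d(F.conv2d(x, k1), k2)   # two unpadded convolutions
out_rep = F.conv2d(x, k3)                 # one merged convolution
torch.testing.assert_close(out_seq, out_rep, rtol=1e-4, atol=1e-4)
```

With no padding the two paths agree up to floating-point error; the padded case is where it breaks down, as discussed below in the thread.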

An example:

```python
import torch.nn.functional as F

ch_out1, ch_in1, ks1 = (3, 5, 3)
ch_out2, ch_in2, ks2 = (1, 3, 3)

k1 = torch.randn(ch_out1, ch_in1, ks1, ks1)
k2 = torch.randn(ch_out2, ch_in2, ks2, ks2)

batch_size, image_height, image_width = 1, 6, 6
x = torch.randn(batch_size, ch_in1, image_height, image_width)

# Apply the two convolutions sequentially, then the merged one.
out1 = F.conv2d(x, k1, padding=1)
out2 = F.conv2d(out1, k2, padding=1)

k3 = reparametrize_conv_kernels(k1, k2)
out12 = F.conv2d(x, k3, padding=2)

torch.testing.assert_allclose(out2, out12)
```

The test fails with the following error:

```
AssertionError: Tensor-likes are not close!
E
E       Mismatched elements: 20 / 36 (55.6%)
E       Greatest absolute difference: 30.521246433258057 at index (0, 0, 1, 5) (up to 1e-05 allowed)
E       Greatest relative difference: 3.333448777518062 at index (0, 0, 0, 1) (up to 0.0001 allowed)
```

I’m not sure about the math behind the whole process (any useful references are really appreciated), and in general I haven’t yet understood how to generalize this procedure to different settings: kernel sizes, padding, groups, etc.

Does `conv_rep` need to have the `kernel_size` indicated above (`ks1 + ks2 - 1`), or are there any alternatives?

Any suggestions?

I think the mismatch is caused by the padding in `conv2`. If you remove the padding everywhere, or move all of it to the first layer with no padding in `conv2`, i.e.:

```python
padding_conv1 = 2
```

you should get the same results (up to floating-point error). The padding in `conv2` zero-pads the intermediate result (`out1`), but those zeros don’t match the values `conv1` would produce on a larger padded input, and this creates the mismatches at the border.

Supporting only some convolutional layer parameters is a serious limitation, though: most convolutional layers in CNN architectures use `padding = 1`, so a block of two sequential padded conv3x3 layers cannot be reparametrized this way.
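To make the fix concrete, here is a self-contained sketch (my own, reusing the shapes from the example above): with all the padding moved to the first convolution (`padding=2`) and none in the second, the merged 5x5 convolution with `padding=2` matches exactly, and the spatial size is preserved.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

k1 = torch.randn(3, 5, 3, 3)   # (ch_out1, ch_in1, 3, 3)
k2 = torch.randn(1, 3, 3, 3)   # (ch_out2, ch_out1, 3, 3)
x = torch.randn(1, 5, 6, 6)

# Merged 5x5 kernel: full convolution of the two 3x3 kernels.
k3 = F.conv2d(k1.permute(1, 0, 2, 3), k2.flip(-1, -2), padding=2)
k3 = k3.permute(1, 0, 2, 3)    # (1, 5, 5, 5)

# All padding on conv1 (padding_conv1 = 2), none on conv2.
out_seq = F.conv2d(F.conv2d(x, k1, padding=2), k2)
out_rep = F.conv2d(x, k3, padding=2)
torch.testing.assert_close(out_seq, out_rep, rtol=1e-4, atol=1e-4)
```

Both paths now see the same zero-padded input before any convolution is applied, so no spurious zeros are introduced between the two layers.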