I have a Conv2d layer, and I need to build a matrix from the layer weights that, when multiplied by the input x, gives the same result as applying the layer to x. I have found unfolding-based solutions that operate on the input, but in my case I would like to get the matrix built from the Conv2d parameters.

In other words, I need a function to_matrix() that computes this matrix, as in the code below. I imagine some flattening of x would be necessary, which would be fine.

import torch
from torch.nn import Conv2d

n_channels = 2
n_out = 3
kernel_size = 3
batch_size = 2
x_size = (4, 5)
layer = Conv2d(n_channels, n_out, kernel_size)
w = layer.weight
b = layer.bias
x = torch.rand((batch_size, n_channels) + x_size)
y = layer(x)
my_y = torch.matmul(to_matrix(w, b), x)  # to_matrix is the function I am looking for
print((y - my_y).max())
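One way to obtain such a matrix, if an explicit construction from the weights is not required, is to exploit linearity: feed every standard basis vector through the layer and collect the results as columns. This is a sketch, assuming a fixed input shape and the layer's default stride/padding; the names conv_to_matrix and b_vec are my own, not part of torch:

```python
import torch
from torch.nn import Conv2d

def conv_to_matrix(layer, in_shape):
    # Build the dense matrix of the linear map realized by `layer` for
    # inputs of shape in_shape = (C, H, W), by pushing each standard
    # basis "image" through the layer as one big batch.
    C, H, W = in_shape
    n_in = C * H * W
    basis = torch.eye(n_in).reshape(n_in, C, H, W)
    with torch.no_grad():
        out = layer(basis)                        # (n_in, C_out, H_out, W_out)
        out = out - layer.bias.view(1, -1, 1, 1)  # keep only the linear part
    M = out.reshape(n_in, -1).T                   # column i = image of basis vector i
    b_vec = layer.bias.view(-1, 1, 1).expand(out.shape[1:]).reshape(-1)
    return M, b_vec

layer = Conv2d(2, 3, 3)
x = torch.rand(2, 4, 5)
M, b_vec = conv_to_matrix(layer, x.shape)
my_y = M @ x.reshape(-1) + b_vec
y = layer(x.unsqueeze(0)).reshape(-1)
print((y - my_y).abs().max())  # should be ~0 (floating-point noise)
```

Note that M has shape (C_out*H_out*W_out, C*H*W), so x must be flattened before the multiplication and the result reshaped back afterwards.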

From what I understand, suppose you have an input tensor x with a shape of [C, H, W], and after applying convolution to x you get a tensor y with a shape of [C_out, H_out, W_out]. The weight of the Conv2d has a shape of [C_out, C, k, k], where k is the kernel size, which can be reshaped to [C_out, C*k*k]. In this scenario, the convolution needs C_out*H_out*W_out*C*k*k multiply-accumulates but only C_out*C*k*k parameters.
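The reshaped-weight view above is exactly the im2col trick: unfold the input into patch columns and multiply by the [C_out, C*k*k] weight matrix. A quick check with made-up sizes (C=2, C_out=3, k=3, 4x5 input; these numbers are just an example):

```python
import torch
import torch.nn.functional as F

layer = torch.nn.Conv2d(2, 3, 3)
x = torch.rand(1, 2, 4, 5)

w_mat = layer.weight.reshape(3, -1)            # (C_out, C*k*k) == (3, 18)
cols = F.unfold(x, kernel_size=3)              # (1, C*k*k, H_out*W_out) == (1, 18, 6)
y_flat = w_mat @ cols + layer.bias.view(-1, 1) # one matmul per batch element
y = y_flat.reshape(1, 3, 2, 3)                 # back to (N, C_out, H_out, W_out)

print((y - layer(x)).abs().max())  # should be ~0
```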

If you want to vectorize the Conv2d operation as a single dense layer, the matrix would have C*H*W*C_out*H_out*W_out entries, so both the parameter count and the computation count grow to that number. I am not sure why you would want to do this, but it clearly increases both significantly.

Indeed, I do not actually want to perform the torch.matmul(to_matrix(w, b), x) computation; rather, I want to obtain the matrix to_matrix(w, b) itself, as a characterization of the linear map defined by the layer weights.