# Grouped linear layer

I want to implement a linear function y = [w_1x_1+b_1; w_2x_2+b_2; …; w_kx_k+b_k]. The input x has size (b, k*c), where b is the batch size, k is the number of groups, and c is the number of channels per group; x_i is the i-th slice of c channels, with its own weight w_i and bias b_i.

A grouped convolution layer is also needed. It behaves like a convolution with groups=k, except that the weights and biases are not expected to be shared between groups.
Although this operation can be implemented with multiple linear or convolutional layers, that is not convenient.

Actually, I would like different samples to be processed with different weights. The idea is to fold the batch dimension into the channels, i.e. (b, c, h, w) -> (1, b*c, h, w), so that each sample corresponds to one group of channels and each group is processed with its own weights, and then reshape the result back.
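A minimal sketch of that idea (all sizes and variable names here are illustrative, not from the original post):

```
import torch

b, c, h, w = 4, 8, 16, 16  # batch, channels per sample, spatial size
c_out = 8

# One convolution whose groups correspond to samples, so every
# sample in the batch is filtered by its own set of weights
conv = torch.nn.Conv2d(
    in_channels=b * c,
    out_channels=b * c_out,
    kernel_size=3,
    padding=1,
    groups=b,
)

x = torch.randn(b, c, h, w)
y = conv(x.reshape(1, b * c, h, w))  # fold the batch into the channels
y = y.reshape(b, c_out, h, w)        # unfold back into a batch
```

Note that this ties the layer to a fixed batch size b, since the number of groups is baked into the module.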


Did you find an elegant way to do that?

Answering my own question: I guess the easiest is just to go with `torch.matmul`.
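A minimal sketch of that approach (the weight and bias tensors below are placeholders; in a module they would be `nn.Parameter`s):

```
import torch

batch, k, dim_in, dim_out = 8, 4, 16, 32

# One weight matrix and one bias vector per group
weight = torch.randn(k, dim_in, dim_out)
bias = torch.randn(k, dim_out)

x = torch.randn(batch, k, dim_in)

# matmul broadcasts over the leading dimensions:
# (batch, k, 1, dim_in) @ (k, dim_in, dim_out) -> (batch, k, 1, dim_out)
y = torch.matmul(x.unsqueeze(2), weight).squeeze(2) + bias
assert y.shape == (batch, k, dim_out)
```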

For `num_blocks` independent transforms, each from `dim_in` to `dim_out`, here is a workaround:

```
import torch

# A grouped 1x1 convolution acts as num_blocks independent
# dim_in -> dim_out linear layers, one per block
block_linear = torch.nn.Conv2d(
    in_channels=num_blocks * dim_in,
    out_channels=num_blocks * dim_out,
    kernel_size=1,
    groups=num_blocks,
    bias=True,
)
```

Then, for `x` of size `[batch x num_blocks x dim_in]`, the mapping looks as follows:

```
x = x.reshape(batch, num_blocks * dim_in, 1, 1)
x = block_linear(x)  # output is of size [batch x (num_blocks * dim_out) x 1 x 1]
x = x.reshape(batch, num_blocks, dim_out)
```
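As a quick sanity check (sizes are illustrative), the grouped 1x1 convolution should agree with applying each block's weight slice separately; the conv weight has shape `(num_blocks * dim_out, dim_in, 1, 1)`, with block `i` owning rows `i*dim_out:(i+1)*dim_out`:

```
import torch

batch, num_blocks, dim_in, dim_out = 2, 3, 4, 5

block_linear = torch.nn.Conv2d(
    in_channels=num_blocks * dim_in,
    out_channels=num_blocks * dim_out,
    kernel_size=1,
    groups=num_blocks,
    bias=True,
)

x = torch.randn(batch, num_blocks, dim_in)
y = block_linear(x.reshape(batch, num_blocks * dim_in, 1, 1))
y = y.reshape(batch, num_blocks, dim_out)

# Recompute block by block: w has shape (num_blocks, dim_out, dim_in)
w = block_linear.weight.reshape(num_blocks, dim_out, dim_in)
b = block_linear.bias.reshape(num_blocks, dim_out)
expected = torch.einsum('bki,koi->bko', x, w) + b
assert torch.allclose(y, expected, atol=1e-6)
```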