Not fully connected layer

Hi everyone,

I would like to implement a layer that looks like a fully connected one but has fewer connections. I managed to do it (I think), but it is really slow. Is there a faster way to do this?

Here is the code:

import torch
import torch.nn as nn

if __name__ == '__main__':
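    # keep the input, the layers and the output on the same device
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    # random tensor
    x = torch.randn(1, 1, 1024, device=device)

    # create one small linear layer per window of the input
    size_window = 16

    nb_linear_needed = x.shape[2] // size_window

    linear_layers = nn.ModuleList()

    for i in range(nb_linear_needed):
        linear_layers.append(nn.Linear(size_window, size_window // 2))

    linear_layers.to(device)

    # pass each window of x through its own linear layer
    new_x = torch.zeros(x.shape[0], x.shape[1], x.shape[2] // 2, device=device)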

    for i, linear_layer in enumerate(linear_layers):
        new_borne_inf = i * (size_window // 2)
        new_borne_sup = new_borne_inf + size_window // 2

        borne_inf = i * size_window
        borne_sup = borne_inf + size_window

        new_x[:, :, new_borne_inf:new_borne_sup] = linear_layer(x[:, :, borne_inf:borne_sup])

    print('x shape:', x.shape)
    print('new_x shape:', new_x.shape)



There was an implementation of a sparse linear layer by Hugging Face:
But even when using CUTLASS, it is slower than a regular fully connected layer, because the dense kernels are so heavily optimized.

Thanks for the resource, but is there a way to control which part of the matrix is set to zero with this method?
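One simple way to choose exactly which weights are zero, independent of the library mentioned above, is to multiply a dense weight by a fixed 0/1 mask. Here is a minimal sketch; `MaskedLinear` and `block_mask` are names I made up for illustration, not part of PyTorch or of that library. Note that this still does a full dense matmul, so it gives control over the connectivity but no speedup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskedLinear(nn.Module):
    """Dense linear layer whose weight is multiplied by a fixed 0/1 mask.

    Hypothetical helper for illustration, not a PyTorch class.
    """

    def __init__(self, in_features, out_features, mask):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # mask has shape (out_features, in_features); registering it as a
        # buffer means it follows .to(device) but is never trained
        self.register_buffer("mask", mask.to(self.linear.weight.dtype))

    def forward(self, x):
        # masked-out weights contribute nothing and receive zero gradient
        return F.linear(x, self.linear.weight * self.mask, self.linear.bias)


def block_mask(n_blocks, in_per_block, out_per_block):
    # block-diagonal mask: output block i only sees input block i
    blocks = [torch.ones(out_per_block, in_per_block) for _ in range(n_blocks)]
    return torch.block_diag(*blocks)


mask = block_mask(64, 16, 8)            # shape (512, 1024)
layer = MaskedLinear(1024, 512, mask)
x = torch.randn(1, 1, 1024)
y = layer(x)
print(y.shape)                          # torch.Size([1, 1, 512])
```

Any other 0/1 pattern of shape (out_features, in_features) can be passed as `mask` in the same way.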

Alternatively, I found this Stack Overflow question: "How to efficiently implement a non-fully connected Linear Layer in PyTorch?"
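For the specific layout in the first post (64 independent windows of 16 inputs, each mapped to 8 outputs), the Python loop can probably be replaced by a single batched matrix multiply. Below is a sketch under that assumption; `BlockLinear` is a hypothetical name, and the initialization is only meant to roughly follow the default range of `nn.Linear`. I have not benchmarked it, so treat any speedup as something to measure.

```python
import torch
import torch.nn as nn


class BlockLinear(nn.Module):
    """Applies an independent small linear map to each window of the last dim.

    Hypothetical helper for illustration, not a PyTorch class.
    """

    def __init__(self, n_blocks, in_per_block, out_per_block):
        super().__init__()
        self.n_blocks = n_blocks
        self.in_per_block = in_per_block
        # assumption: uniform init in +-1/sqrt(fan_in), similar to nn.Linear
        bound = in_per_block ** -0.5
        # one (in, out) weight matrix per block, stored in a single tensor
        self.weight = nn.Parameter(
            torch.empty(n_blocks, in_per_block, out_per_block).uniform_(-bound, bound)
        )
        self.bias = nn.Parameter(
            torch.empty(n_blocks, out_per_block).uniform_(-bound, bound)
        )

    def forward(self, x):
        lead = x.shape[:-1]
        # split the last dimension into (n_blocks, in_per_block)
        x = x.reshape(*lead, self.n_blocks, self.in_per_block)
        # multiply every block by its own weight matrix in one call
        y = torch.einsum("...ni,nio->...no", x, self.weight) + self.bias
        # merge the blocks back into one feature dimension
        return y.reshape(*lead, -1)


size_window = 16
layer = BlockLinear(1024 // size_window, size_window, size_window // 2)
x = torch.randn(1, 1, 1024)
new_x = layer(x)
print('x shape:', x.shape)          # x shape: torch.Size([1, 1, 1024])
print('new_x shape:', new_x.shape)  # new_x shape: torch.Size([1, 1, 512])
```

Because all 64 weight matrices live in one parameter, the forward pass is one kernel launch instead of 64 small ones, which is usually where the loop version loses time. A grouped `nn.Conv1d` with `kernel_size=1` and `groups=64` should express the same connectivity if you prefer a built-in layer.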