Weight-sharing perceptrons

I want to implement a network with weight-sharing perceptrons.
Suppose I use a structure like the one below:

import torch
import torch.nn as nn

class ABC(nn.Module):
    def __init__(self, n, ch_in, d):
        super(ABC, self).__init__()
        self.n = n
        self.ch_in = ch_in
        self.d = d
        # a single Linear layer, reused for every chunk -> shared weights
        self.linear = nn.Linear(self.ch_in, self.d)

    def forward(self, x):
        outt = []
        for i in range(self.n):
            # slice out the i-th chunk of ch_in features and apply the shared layer
            outt.append(self.linear(x[:, self.ch_in * i : self.ch_in * (i + 1)]))
        out = torch.cat(outt, dim=1)
        return out
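One way to avoid the Python loop entirely is to reshape the input so the shared `nn.Linear` is applied to all n chunks in a single batched matmul (`nn.Linear` operates on the last dimension). A minimal sketch, assuming `x` has shape `(batch, n * ch_in)` as in the module above; the class name `ABCVectorized` is my own:

```python
import torch
import torch.nn as nn

class ABCVectorized(nn.Module):
    """Same weight sharing as ABC, but without a Python-level loop."""
    def __init__(self, n, ch_in, d):
        super().__init__()
        self.n = n
        self.ch_in = ch_in
        # one Linear layer shared across all n chunks
        self.linear = nn.Linear(ch_in, d)

    def forward(self, x):
        b = x.size(0)
        # (b, n*ch_in) -> (b, n, ch_in); Linear acts on the last dim
        out = self.linear(x.view(b, self.n, self.ch_in))
        # (b, n, d) -> (b, n*d), matching torch.cat(outt, dim=1)
        return out.view(b, -1)
```

This should be numerically identical to the loop-and-cat version, since the chunks in `x` are laid out contiguously in exactly the order `view` expects.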

How will this perform on the GPU?
Should I avoid using Python lists in the forward pass?

In my trials, replacing the loop and list with conv layers made processing about 3 times faster.