How to implement multiple parallel layers with different activation functions

Hello all. I keep getting stuck on how to implement a very simple two-layer fully-connected network where the first layer is actually 50 layers in parallel. Each input is fed to only one neuron in the first “layer”, and these neurons have different nonlinearities. The outputs of all the first-layer neurons are then passed to the second (output) layer. I am not sure what the canonical way to implement this is, as I keep ending up with mismatch errors. Here is an example sketch implementation:

    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            # Yes, it's 1x1 ... don't ask why
            self.first_layers = [nn.Linear(1, 1) for di in range(L1_NEURONS)]

            self.second_layer = nn.Linear(L1_NEURONS, 1)

            self.outs = numpy.zeros((L1_NEURONS))

        def forward(self, xinput):
            sums = [self.first_layers[di](xinput) for di in range(L1_NEURONS)]
            for di in range(L1_NEURONS):
                if di < 30:
                    self.outs[di] = torch.tanh(sums[di])
                else:
                    self.outs[di] = torch.relu(sums[di])
            return self.second_layer(self.outs)

My xinput shape is (batch_size, L1_NEURONS). How should I shape my xinput and self.outs to fit this scheme? Sorry for the newbie question; I'm looking for a way to avoid iterating over each batch sample. Thanks!

You can use torch.split and torch.cat (see torch — PyTorch 2.1 documentation):

  1. Split xinput into xinput1 and xinput2 using torch.split
  2. Apply a different non-linearity to xinput1 and xinput2 respectively
  3. Use torch.cat to concatenate the results back together (a sketch follows below)
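
Here is a minimal sketch of those three steps, reusing the names from your post. L1_NEURONS and the 30-tanh / 20-relu split are taken from your sketch; the per-column slicing in forward is my assumption about how each input column should reach its own neuron:

    import torch
    import torch.nn as nn

    L1_NEURONS = 50  # as in the sketch; first 30 neurons tanh, the rest relu

    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            # nn.ModuleList (not a plain Python list) so PyTorch actually
            # registers the 1x1 layers' parameters with the module.
            self.first_layers = nn.ModuleList(
                [nn.Linear(1, 1) for _ in range(L1_NEURONS)])
            self.second_layer = nn.Linear(L1_NEURONS, 1)

        def forward(self, xinput):
            # xinput: (batch_size, L1_NEURONS); column di feeds only neuron di.
            sums = torch.cat(
                [self.first_layers[di](xinput[:, di:di + 1])
                 for di in range(L1_NEURONS)],
                dim=1)  # (batch_size, L1_NEURONS)
            # 1. split into a tanh group and a relu group along the feature dim
            x1, x2 = torch.split(sums, [30, L1_NEURONS - 30], dim=1)
            # 2. apply the two non-linearities, 3. concatenate back together
            outs = torch.cat([torch.tanh(x1), torch.relu(x2)], dim=1)
            return self.second_layer(outs)  # (batch_size, 1)

This keeps everything as tensors for the whole batch at once (the numpy buffer in your version detaches the values from autograd and causes the type/shape mismatches). As a further note: since each 1x1 Linear is just a per-feature scale and shift, you could also replace the whole ModuleList with two learnable length-L1_NEURONS parameter vectors w and b and compute xinput * w + b, removing the Python loop entirely.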