Parallel Layers on a single GPU

I have an input tensor x of size (batch_size, 4, 10). I would like to create 4 "small" separate fully connected (Linear) layers in parallel (in_features = 10, out_features = 5).
The 1st linear layer will be fed x[:, 0:1, :]
The 2nd linear layer will be fed x[:, 1:2, :]
The 3rd linear layer will be fed x[:, 2:3, :]
The 4th linear layer will be fed x[:, 3:4, :]

I am using a single GPU for training. Is there a way to implement this in parallel, so I won't need to use a for loop like this:

# in __init__: use nn.ModuleList so the layers' parameters are registered with the module
self.layers = nn.ModuleList([nn.Linear(10, 5) for _ in range(4)])

# in forward: apply each layer to its own slice of x
output = torch.zeros(batch_size, 4, 5, device=x.device)
for i in range(x.shape[1]):
    output[:, i : i + 1, :] = self.layers[i](x[:, i : i + 1, :])
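
For reference, one way to avoid the Python loop is to hold all 4 weight matrices in a single (4, 10, 5) parameter tensor and evaluate them with one batched contraction (torch.einsum). The sketch below is illustrative, not a standard PyTorch module; the class name ParallelLinear and the simple random initialization are assumptions, and it assumes the 4 layers are fully independent as described above.

import torch
import torch.nn as nn

class ParallelLinear(nn.Module):
    """Sketch: 4 independent Linear(10 -> 5) layers evaluated as one batched matmul."""
    def __init__(self, n_parallel=4, in_features=10, out_features=5):
        super().__init__()
        # one weight/bias tensor holding the parameters of all 4 layers
        # (simple random init for illustration, not nn.Linear's default scheme)
        self.weight = nn.Parameter(torch.randn(n_parallel, in_features, out_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(n_parallel, out_features))

    def forward(self, x):
        # x: (batch_size, 4, 10) -> output: (batch_size, 4, 5)
        # einsum contracts the feature dimension per slice, with no Python loop
        return torch.einsum('bni,nio->bno', x, self.weight) + self.bias

Calling ParallelLinear()(x) on a (batch_size, 4, 10) tensor then gives the same (batch_size, 4, 5) output shape as the loop version. A grouped 1x1 convolution (nn.Conv1d(40, 20, kernel_size=1, groups=4) applied to x reshaped to (batch_size, 40, 1)) should also work as an alternative, if you prefer to stay with built-in layers.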