I have multiple heads of FC layers defined with a nn.ModuleList as:

self.multihead_fc_layers = nn.ModuleList([
    nn.Sequential(nn.Linear(64, 2)) for _ in range(512)
])

The input to this layer is a tensor of size (bs, 64), where bs is the batch size. The goal is for the sub-network to yield 512 outputs of 2 neurons each, i.e. a tensor of size (bs, 512, 2). Currently, I'm managing to do this as follows:

A = []
for h in range(512):
    A.append(softmax(self.multihead_fc_layers[h](E), dim=1)[:, 1].unsqueeze(1))
A = torch.cat(A, dim=1)

where E is the input tensor of size (bs, 64). Is there a way to write this more efficiently, without the for loop?

Yes: the weight matrices can simply be concatenated into a single FC layer. Applying multiple FCs to the same input and concatenating the results is the same as concatenating the FCs into one layer and applying it to the input once.
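A minimal sketch of this idea: replace the 512 small `nn.Linear(64, 2)` heads with one `nn.Linear(64, 512 * 2)`, reshape the output to (bs, 512, 2), and take the softmax over the last dimension. The snippet below copies the weights out of a `ModuleList` like yours into the big layer to verify that the fused version matches the loop (names like `big_fc` are illustrative, not from your code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

bs = 4
heads = nn.ModuleList([nn.Sequential(nn.Linear(64, 2)) for _ in range(512)])

# One big FC whose weight/bias are the concatenation of the 512 small heads.
big_fc = nn.Linear(64, 512 * 2)
with torch.no_grad():
    big_fc.weight.copy_(torch.cat([h[0].weight for h in heads], dim=0))  # (1024, 64)
    big_fc.bias.copy_(torch.cat([h[0].bias for h in heads], dim=0))      # (1024,)

E = torch.randn(bs, 64)

# Original loop version: (bs, 512)
loop_out = torch.cat(
    [F.softmax(h(E), dim=1)[:, 1].unsqueeze(1) for h in heads], dim=1
)

# Fused version: one matmul, then reshape to (bs, 512, 2) and softmax per head.
fused_out = F.softmax(big_fc(E).view(bs, 512, 2), dim=2)[:, :, 1]

assert torch.allclose(loop_out, fused_out, atol=1e-5)
```

In a fresh model you would just define `big_fc` directly instead of the `ModuleList`; the weight-copying here is only to demonstrate the equivalence. If you want the full (bs, 512, 2) tensor rather than only the probability of class 1, drop the final `[:, :, 1]` indexing.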