Avoiding a for loop in the forward method with multi-head FC layers

I have multiple heads of FC layers defined with a nn.ModuleList as:

self.multihead_fc_layers = nn.ModuleList([nn.Sequential(
    nn.Linear(64, 2)
) for _ in range(512)])

The input to this layer is a tensor of size (bs, 64), where bs is the batch size. The goal is for the sub-network to yield 512 tensors of 2 neurons each, i.e. a tensor of size (bs, 512, 2). Currently, I'm doing this as follows:

from torch.nn.functional import softmax

A = []
for h in range(512):
    # probability of class 1 from each head, shape (bs, 1)
    A.append(softmax(self.multihead_fc_layers[h](E), dim=1)[:, 1].unsqueeze(1))
A = torch.cat(A, dim=1)  # (bs, 512)

where E is the input tensor of size (bs, 64). Is there a way to write this more efficiently, without the for loop?

You can merge the 512 heads into a single linear layer and reshape the output:

def __init__(...):
    self.head = nn.Linear(64, 2 * 512)

def forward(....):
    A = self.head(E).reshape(-1, 512, 2).softmax(dim=-1)[:, :, 1]
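As a shape sanity check, the merged head can be exercised outside the module (a standalone sketch; the batch size bs=8 is assumed for illustration):

```python
import torch
import torch.nn as nn

bs = 8  # assumed batch size for this check
head = nn.Linear(64, 2 * 512)  # one FC replaces the 512 small heads

E = torch.randn(bs, 64)
# reshape the 1024 outputs into 512 pairs, softmax each pair, keep class-1 prob
A = head(E).reshape(-1, 512, 2).softmax(dim=-1)[:, :, 1]
print(A.shape)  # torch.Size([8, 512])
```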

But if you are doing a 2-element softmax, you might as well just use a 1-element output + sigmoid instead. I.e.,

def __init__(...):
    self.head = nn.Linear(64, 512)

def forward(....):
    A = self.head(E).sigmoid()
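The two formulations are equivalent because for a 2-way softmax, softmax([z0, z1])[1] = 1 / (1 + exp(z0 - z1)) = sigmoid(z1 - z0), so a single output unit can learn the logit difference directly. A quick numerical check (a standalone sketch with random logits):

```python
import torch

z = torch.randn(4, 2)  # hypothetical 2-class logits
p_softmax = z.softmax(dim=-1)[:, 1]            # class-1 probability via softmax
p_sigmoid = (z[:, 1] - z[:, 0]).sigmoid()      # same probability via sigmoid of the logit gap
assert torch.allclose(p_softmax, p_sigmoid, atol=1e-6)
```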

Thank you @SimonW for your answer. Are all of these heads trained separately?

Yeah, the weight matrices are just concatenated together as one FC. Applying multiple FCs to the same input and concatenating the results is the same as concatenating the FCs and applying them once to the input.
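That equivalence can be verified numerically by copying per-head weights into one big linear layer (a standalone sketch using 4 heads instead of 512 for brevity):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
heads = nn.ModuleList([nn.Linear(64, 2) for _ in range(4)])  # 4 small heads

# build one big FC by stacking the per-head weights and biases
big = nn.Linear(64, 2 * 4)
with torch.no_grad():
    big.weight.copy_(torch.cat([h.weight for h in heads], dim=0))  # (8, 64)
    big.bias.copy_(torch.cat([h.bias for h in heads], dim=0))      # (8,)

E = torch.randn(3, 64)
loop_out = torch.stack([h(E) for h in heads], dim=1)  # loop version, (3, 4, 2)
merged_out = big(E).reshape(-1, 4, 2)                 # merged version, (3, 4, 2)
assert torch.allclose(loop_out, merged_out, atol=1e-6)
```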


Great! Thanks 🙂