Say I have an input tensor x, and my model contains multiple heads, each of which processes x separately; afterwards, the outputs are concatenated. If the heads were linear layers this would not be a problem, since I could merge them into a single linear layer. However, the heads are convolutional networks. So far I have put all the heads in a ModuleList and iterated over it, each time concatenating the new result onto an existing tensor. However, this is really slow. Is there a better approach?
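For reference, here is a minimal sketch of the kind of setup I mean (the head architecture and sizes are just placeholders): each head is a small conv net, applied to the same x in a Python loop over the ModuleList, with the outputs concatenated along the channel dimension.

```python
import torch
import torch.nn as nn

class MultiHeadConv(nn.Module):
    """Placeholder example: num_heads identical conv nets, each applied to x."""
    def __init__(self, num_heads=4, in_channels=3, out_channels=8):
        super().__init__()
        self.heads = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
            )
            for _ in range(num_heads)
        )

    def forward(self, x):
        # Each head processes the same input separately; the per-head
        # outputs are then concatenated along the channel dimension.
        outputs = [head(x) for head in self.heads]
        return torch.cat(outputs, dim=1)

x = torch.randn(2, 3, 16, 16)
model = MultiHeadConv()
y = model(x)
print(y.shape)  # torch.Size([2, 32, 16, 16])
```

The loop over `self.heads` is the part I would like to avoid, since it launches each head's convolutions sequentially.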