Forward pass through different branches

Instead of iterating each row in out, you could split out before into both inputs, call fc_1 and fc_2, and create the output tensor using the results.
Depending on the batch size, the performance difference might be insignificant.

1 Like