Hi All,

We are trying to run multiple functional Conv2D layers in parallel. For example, we have a tensor of weights `weights = torch.rand(10,12,3,5,5)`, where the axes represent `(number_of_conv2d_layers, out_channels, in_channels, kernel_H, kernel_W)`. In addition, we have an input `x = torch.rand(20,3,28,28)`, where the axes represent `(batch, channels, H, W)`. We want to apply each entry of `weights` to `x`. We were able to implement this with a for loop in the following manner:

```
import torch


class ForConv2d(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x, weights):
        # Apply each (out_channels, in_channels, kH, kW) weight tensor to x
        # separately, then stack the results along a new leading axis.
        new_x = []
        for w in weights:
            new_x.append(torch.nn.functional.conv2d(input=x, weight=w))
        return torch.stack(new_x)
```

We are sure there is a more natural implementation that avoids the for loop (maybe using Conv3D?).
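For concreteness, here is one loop-free candidate we can sketch, using a grouped convolution (`groups=n`) rather than Conv3D. This is only an assumption on our part, not a verified solution; the variable names are ours:

```
import torch
import torch.nn.functional as F

x = torch.rand(20, 3, 28, 28)          # (batch, channels, H, W)
weights = torch.rand(10, 12, 3, 5, 5)  # (n_layers, out_c, in_c, kH, kW)
n, out_c, in_c, kh, kw = weights.shape

# The loop version we currently use, for comparison.
looped = torch.stack([F.conv2d(x, w) for w in weights])

# Grouped version: tile x's channels once per layer, merge the layer axis
# into the output-channel axis, and let groups=n keep the layers apart.
x_rep = x.repeat(1, n, 1, 1)                       # (20, n*in_c, 28, 28)
w_flat = weights.reshape(n * out_c, in_c, kh, kw)  # (n*out_c, in_c, kH, kW)
grouped = F.conv2d(x_rep, w_flat, groups=n)        # (20, n*out_c, 24, 24)
h_out, w_out = grouped.shape[-2:]
grouped = grouped.reshape(-1, n, out_c, h_out, w_out).permute(1, 0, 2, 3, 4)

print(grouped.shape)                               # (10, 20, 12, 24, 24)
print(torch.allclose(looped, grouped, atol=1e-5))  # True
```

If this is indeed equivalent, is it the idiomatic way to do it, or is there something cleaner?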

Thanks in advance.