I have a stack of feature maps as given below (my batch size is 32, and I have 64 feature maps, each of size 64x64):

```
out0 = self.selection[0](x)  # shape: [32, 64, 64, 64]
out1 = self.selection[1](x)  # shape: [32, 64, 64, 64]
out2 = self.selection[2](x)  # shape: [32, 64, 64, 64]
out = torch.stack([out0, out1, out2]).permute(1, 0, 2, 3, 4)  # shape: [32, 3, 64, 64, 64]
```

I am selecting feature maps along the second dimension (of size 3) using a one-hot encoded tensor `y` of shape [32, 3] with `torch.einsum`, as given below:

```
out = torch.einsum("x y, x y c d e -> x c d e", y, out)
```
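For clarity, this einsum should be equivalent to broadcasting the one-hot weights over the trailing dimensions and summing out the branch axis. A small self-contained sketch (toy shapes, not the real [32, 3, 64, 64, 64]) checking that assumption against direct indexing:

```python
import torch

# Toy sizes standing in for batch=32, branches=3, maps 64x64x64
B, K, C, H, W = 4, 3, 2, 5, 5
out = torch.randn(B, K, C, H, W)

# Random one-hot selector: each row picks exactly one of the K branches
idx = torch.randint(0, K, (B,))
y = torch.nn.functional.one_hot(idx, K).float()

# einsum selection, as in the post
selected = torch.einsum("x y, x y c d e -> x c d e", y, out)

# Equivalent direct indexing: pick the branch named by idx for each sample
reference = out[torch.arange(B), idx]

print(torch.allclose(selected, reference))
```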

Even though the model trains, I tried to check the backpropagation updates for `out0`, `out1`, `out2` by providing data only for `out0` (i.e., a one-hot tensor whose rows are all `[1, 0, 0]`). However, `optimizer.step()` is still updating the weights in the layers corresponding to `out1` and `out2`, which shouldn't be the case, since those outputs are multiplied by zero every time.
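To make the check concrete, here is a minimal self-contained sketch of what I mean (the branch modules are hypothetical stand-ins for `self.selection[0..2]`): inspect the gradients reaching the unselected branches after `backward()` and before `optimizer.step()`.

```python
import torch

torch.manual_seed(0)
# Three small branches standing in for self.selection[0..2]
branches = torch.nn.ModuleList([torch.nn.Linear(4, 4) for _ in range(3)])

x = torch.randn(8, 4)
y = torch.zeros(8, 3)
y[:, 0] = 1.0  # one-hot selector: always pick branch 0

out = torch.stack([b(x) for b in branches], dim=1)  # shape: [8, 3, 4]
selected = torch.einsum("x y, x y c -> x c", y, out)
loss = selected.sum()
loss.backward()

# Branches 1 and 2 are multiplied by zero, so their gradients should be zero
for i, b in enumerate(branches):
    print(i, b.weight.grad.abs().sum().item())
```

With this selector, branches 1 and 2 receive exactly-zero gradients, so any change to their weights must come from the optimizer step itself rather than from backpropagation.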

Has anyone run into the same problem?