I have a stack of feature maps as given below. (my batch size is 32 and I have 64 feature maps with each with size 64x64)
out0 = self.selection[0](x)#output=[32,64,64,64]
out1 = self.selection[1](x)#output=[32,64,64,64]
out2 = self.selection[2](x)#output=[32,64,64,64]
out=torch.stack([out0,out1,out2]).permute(1, 0, 2, 3, 4)#output=[32,3,64,64,64]
and I am selecting feature maps from the second dimension (3) using a one-hot encoded vector (y, 32x3) by using torch.einsum as given below.
out=torch.einsum("x y, x y c d e -> x c d e",y,out)
Even though the model trains, when I tried to check the backpropagation updates for out0,out1,out2
, by providing data only for out0
(basically a one hot encoder tensor consists of only 1,0,0
) however the optimizer.step()
is upgrading the weights in layers corresponds to out1,out2
.(which shouldn’t be the case they are basically multiplying by zero every time.)
Anyone ever had the same problem?