Given a feature map x of size n*c*w*h, I want to set some specific channels to zero for each sample n in the batch.
I multiply x by a mask, but the accuracy is not satisfactory.
def forward(self, x):
    # all-ones mask, so x * mask should leave x unchanged
    mask = Variable(torch.ones(x.size()))
    mask = mask.cuda()
    x = x * mask
    x = F.avg_pool2d(x, x.size()[2:])
    x = x.view(x.size(0), -1)
    x = self.classifier(x)
    return x
Since the mask is all ones, x = x * mask does not change the value of x, so why do I get a lower accuracy than when I omit this multiplication?
Is something going wrong in the backward pass during training?
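For reference, here is a minimal sketch of what I am ultimately trying to do: zero out specific channels per sample by building a (n, c, 1, 1) mask that broadcasts over the spatial dimensions. The tensor shapes and channel indices here are made up for illustration.

```python
import torch

# toy feature map: n=2 samples, c=4 channels, 3x3 spatial
x = torch.randn(2, 4, 3, 3)

# per-sample channel mask, broadcast over w and h
mask = torch.ones(x.size(0), x.size(1), 1, 1)
mask[0, [1, 3]] = 0   # zero channels 1 and 3 of sample 0
mask[1, [0]] = 0      # zero channel 0 of sample 1

y = x * mask          # broadcasts to (n, c, w, h)

assert torch.all(y[0, 1] == 0)          # masked channel is zeroed
assert torch.equal(y[0, 0], x[0, 0])    # unmasked channel is unchanged
```

In my actual model the mask entries would depend on the sample, but the broadcasting pattern is the same.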