Given a feature map x of size n × c × w × h, I want to set some specific channels to zero for each sample n.
I multiply x by a mask, but the resulting accuracy is not satisfactory.
    def forward(self, x):
        mask = Variable(torch.ones(x.size()))
        mask = mask.cuda()
        x = x * mask
        x = F.avg_pool2d(x, x.size()[2:])
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x
Since the mask here is all ones, x = x * mask does not change the value of x. Why do I then get lower accuracy than when I leave this multiplication out?
Is something going wrong in the backward pass during training?
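
To make the goal concrete, here is a minimal sketch of the kind of per-sample channel mask I have in mind, together with a small check that an all-ones mask leaves both the output and the gradient untouched. The helper name channel_mask and its drop argument are hypothetical, the example runs on the CPU, and it assumes a PyTorch version where plain tensors track gradients (no Variable wrapper).

```python
import torch

def channel_mask(drop, num_channels):
    """Hypothetical helper: build an (n, c, 1, 1) mask.

    drop[i] lists the channel indices to zero for sample i.
    """
    mask = torch.ones(len(drop), num_channels, 1, 1)
    for i, channels in enumerate(drop):
        if channels:
            mask[i, channels] = 0.0
    return mask

torch.manual_seed(0)
x = torch.randn(2, 3, 4, 4, requires_grad=True)

# Sample 0 loses channels 0 and 2; sample 1 keeps every channel.
mask = channel_mask([[0, 2], []], num_channels=3)
y = x * mask  # the (n, c, 1, 1) mask broadcasts over w and h

# All-ones case: the gradient should match the run without the multiplication.
ones = torch.ones_like(x)
(x * ones).sum().backward()
grad_masked = x.grad.clone()
x.grad = None
x.sum().backward()
grad_plain = x.grad.clone()
same_grad = torch.equal(grad_masked, grad_plain)
```

If same_grad is true in a check like this, the multiplication itself is probably not what changes the training result, and the difference would have to come from somewhere else in the setup.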