I am trying to write a neural network that can learn to assign a score to each channel. The following code fails when I run backprop. Here is a toy example that illustrates the problem:
```python
import torch
import torch.nn as nn

class ChannelScorer(nn.Module):
    def __init__(self):
        super(ChannelScorer, self).__init__()
        self.layer1 = nn.Conv2d(in_channels=100, out_channels=200, kernel_size=3, stride=2)
        self.layer2 = nn.Linear(1800, 100)

    def forward(self, x):
        z1 = self.layer1(x)
        z1 = torch.flatten(z1, start_dim=1)
        z2 = self.layer2(z1)
        return z2

channel_scorer = ChannelScorer()
x = torch.rand(32, 100, 8, 8)
y = torch.rand(32, 100, 8, 8)
indices_score = channel_scorer(x)
_, indices = torch.topk(indices_score, 50)  # topk returns (values, indices)
x[:, indices, :, :] = 0  # zero out the selected channels
loss = (x - y) ** 2
loss = torch.mean(loss)
loss.backward()
```
I am getting the following error:

```
 98     tensors, grad_tensors, retain_graph, create_graph,
--> 99     allow_unreachable=True)  # allow_unreachable flag
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
```
My guess is that indexing with the top-k indices makes the whole operation non-differentiable, so the loss ends up disconnected from the scorer's parameters. Is there a way to circumvent this?
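For reference, a soft-masking variant does keep the graph connected in a minimal sketch (this is just my own simplification to illustrate what I mean by circumventing: the hard top-k zeroing is replaced with a differentiable sigmoid mask, and the scores here are a stand-in for the `channel_scorer` output):

```python
import torch

# Stand-in for channel_scorer(x): one score per channel, with grad tracking.
scores = torch.randn(32, 100, requires_grad=True)
x = torch.rand(32, 100, 8, 8)
y = torch.rand(32, 100, 8, 8)

# Differentiable mask in [0, 1]: low-scoring channels are softly suppressed
# instead of being hard-zeroed via top-k indexing.
mask = torch.sigmoid(scores)            # shape (32, 100)
x_masked = x * mask[:, :, None, None]   # broadcast the mask over H and W

loss = torch.mean((x_masked - y) ** 2)
loss.backward()                          # succeeds: gradients reach `scores`
print(scores.grad is not None)           # True
```

So the question is really whether there is a standard way to get this kind of gradient flow while still doing a hard top-k selection.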