I am trying to write a neural network that can learn to assign a score to each channel. The following code fails when I run backprop. Here is a toy example that illustrates the problem:
```python
import torch
import torch.nn as nn

class ChannelScorer(nn.Module):
    def __init__(self):
        super(ChannelScorer, self).__init__()
        self.layer1 = nn.Conv2d(in_channels=100, out_channels=200, kernel_size=3, stride=2)
        self.layer2 = nn.Linear(1800, 100)

    def forward(self, x):
        z1 = self.layer1(x)
        z1 = torch.flatten(z1, start_dim=1)
        z2 = self.layer2(z1)
        return z2

channel_scorer = ChannelScorer()
x = torch.rand(32, 100, 8, 8)
y = torch.rand(32, 100, 8, 8)
indices_score = channel_scorer(x)
_, indices = torch.topk(indices_score, 50)  # topk returns (values, indices)
x[:, indices, :, :] = 0  # zero out the selected channels
loss = (x - y) ** 2
loss = torch.mean(loss)
loss.backward()
```
I am getting the following error:

```
 98     tensors, grad_tensors, retain_graph, create_graph,
--> 99     allow_unreachable=True)  # allow_unreachable flag
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
```
My guess is that indexing with the top-k indices makes the whole operation non-differentiable, so the loss ends up disconnected from the scorer's parameters. Is there a way to circumvent this?
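For reference, a soft-masking variant does keep the graph connected in a minimal sketch (this is just my own simplification to illustrate what I mean by circumventing: the hard top-k zeroing is replaced with a differentiable sigmoid mask, and the scores here are a stand-in for the `channel_scorer` output):

```python
import torch

# Stand-in for channel_scorer(x): one score per channel, with grad tracking.
scores = torch.randn(32, 100, requires_grad=True)
x = torch.rand(32, 100, 8, 8)
y = torch.rand(32, 100, 8, 8)

# Differentiable mask in [0, 1]: low-scoring channels are softly suppressed
# instead of being hard-zeroed via top-k indexing.
mask = torch.sigmoid(scores)            # shape (32, 100)
x_masked = x * mask[:, :, None, None]   # broadcast the mask over H and W

loss = torch.mean((x_masked - y) ** 2)
loss.backward()                          # succeeds: gradients reach `scores`
print(scores.grad is not None)           # True
```

So the question is really whether there is a standard way to get this kind of gradient flow while still doing a hard top-k selection.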