I have a somewhat tricky derivative to calculate with autograd: I want the importance of an input pixel (i.e. the derivative of the loss with respect to that pixel), differentiated with respect to a specific weight/layer.
So basically d(dL/dI)/dw. How do I make sure autograd understands this? When I just try to take this derivative, it goes roughly like so:
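(A sketch of the call I mean, with layer standing in for the layer whose weights I want; the exact line in my code may differ slightly.)

second_grad = torch.autograd.grad(
    outputs=self.gradients,                        # dL/dI, computed in the snippet below
    inputs=layer.weight,                           # the weights w I want the derivative with respect to
    grad_outputs=torch.ones_like(self.gradients),  # needed since self.gradients is not a scalar
)[0]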
With self.gradients being the gradient of the loss with respect to the input, and layer.weight being the weights of a layer, I get the warning that self.gradients has an empty computation graph, even though I compute it like this:
self.model.eval() # Set the model to evaluation mode
input_data.requires_grad = True # Set requires_grad to True to compute gradients
self.label = label
if input_data.grad is not None:
    input_data.grad.zero_()
# Forward pass
# print("for the real image the size is "+str(input_data.size()))
outputs = self.model(input_data)
if self.last_layer_linear:
    self.activations["output"] = outputs
self.input = input_data
target = torch.zeros(outputs.size(), dtype=torch.float)
target[0][label] = 1.0
self.target = target
self.loss = self.default_loss(outputs, target)
# Backpropagate to compute gradients with respect to the input
self.loss.backward(retain_graph=True)
# Get the gradients with respect to the input
self.gradients = input_data.grad.clone().detach()
self.gradients.requires_grad = True
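For what it's worth, if I inspect self.gradients right after this block, it looks like a fresh leaf tensor with no recorded history, which is presumably why the step above complains:

print(self.gradients.grad_fn)  # prints None: no operation history was recorded for this tensor
print(self.gradients.is_leaf)  # prints True: it is treated as a brand-new leaf tensor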
If anyone has any ideas, I would be very grateful!