Here, I’m finding the derivative of the output “true class” w.r.t. input pixels.
Now, I want to use ‘guided backpropagation’, i.e., taking max(grad, 0) at each layer before passing the gradient to the previous layer during the backward pass. The documentation of register_backward_hook is not very detailed.
I want something like

def hookfunc(model, gi, go):
    grads = gi > 0
    return grads

h = model.register_backward_hook(hookfunc)
I tried doing this, but the hook callback function never runs. I think that is because I’m calling .backward() on an element of the output Variable rather than on the model itself. I’m not sure about this, and the documentation doesn’t make it clear.
To clarify, I want the hook callback function to run for every layer during back pass. Any help on how to do this would be appreciated.
I don’t think the hook will work as expected (see the recent thread “Exact meaning of grad_input and grad_output” for a discussion of the values the hook receives).
If you write the model yourself, you could just use hidden.register_hook(lambda grad: grad.clamp(min=0)) (similar to the gradient clipping discussed here) on your activations between the layers.
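To illustrate the suggestion, here is a minimal sketch with a toy two-layer model (the layers and shapes are made up for the example, and it uses the current tensor API rather than Variables): a tensor hook registered on a hidden activation clamps the gradient flowing backward through it.

```python
import torch
import torch.nn as nn

# Toy model: two linear layers with a ReLU in between (hypothetical example).
layer1 = nn.Linear(4, 8)
layer2 = nn.Linear(8, 2)

x = torch.randn(1, 4, requires_grad=True)
hidden = torch.relu(layer1(x))

# Tensor hook on the activation: only non-negative gradients pass backward
# through this point, as in guided backpropagation.
hidden.register_hook(lambda grad: grad.clamp(min=0))

out = layer2(hidden)
out[0, 1].backward()  # backprop from one output element, as in the question

print(x.grad.shape)  # gradient w.r.t. the input, computed with clamping
```

The hook fires once for this tensor during the backward pass, so you would register one such hook per activation you want clamped.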
First of all, thanks for the awesome explanation of gradInput and gradOutput. I was trying to figure out what they were, but couldn’t find it anywhere in the docs.
I’ve found a workaround for this. It’s not an elegant solution, but it works.
def saliency_map_general(model, input, label):
    if isinstance(model, torchvision.models.Inception3):
        input = preprocess_inception(input)
    else:
        input = preprocess(input)

    def hookfunc(module, gradInput, gradOutput):
        print('hook callback is running')
        # do something here

    h = [m.register_backward_hook(hookfunc) for m in model.modules()]

    output = model(input)
    model.zero_grad()
    output[0][label].backward()

    for handle in h:
        handle.remove()

    grads = input.grad.data.clamp(min=0)
    grads.squeeze_()
    grads.transpose_(0, 1)
    grads.transpose_(1, 2)
    grads = np.amax(grads.cpu().numpy(), axis=2)
    return grads
Can someone guide me as to how I should clamp the gradients to 0? (where I wrote “do something here” in the code).
gradInput and gradOutput are both tuples, so I’m not able to do any operations on them directly.
Converting gradInput to a list takes far too long (I grew impatient and restarted the Jupyter notebook after a few minutes).
Does return tuple(None if g is None else g.clamp(min=0) for g in gradInput) look vaguely good? (I didn’t try it, but I would hope the gradients are either None or a Variable. Note the hook should return a tuple, not a list.)
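As a sanity check, here is a small sketch of that hook on a single ReLU module (a toy stand-in, not the full saliency code above): the hook clamps every non-None entry of gradInput and returns them as a tuple.

```python
import torch
import torch.nn as nn

# Module backward hook: clamp each gradInput entry to be non-negative,
# passing None entries through untouched.
def guided_hook(module, grad_input, grad_output):
    return tuple(None if g is None else g.clamp(min=0) for g in grad_input)

layer = nn.ReLU()
handle = layer.register_backward_hook(guided_hook)

x = torch.randn(3, requires_grad=True)
y = layer(x).sum()
y.backward()
handle.remove()

print((x.grad >= 0).all())  # the gradient that reached x is non-negative
```

In recent PyTorch versions, register_full_backward_hook is the recommended replacement for register_backward_hook, with the same (module, grad_input, grad_output) signature.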