Tracing back from output to input - accessing internal buffers


I am trying to re-implement a visualization technique similar to CNN-Fixations. It's a simple idea: take the maximum probability from the output and manually trace it back to find the supporting evidence in the input. I think the best way to understand it is to look at the figures in the paper:

Anyway, I didn't have any success with backward hooks, but I managed to reproduce the function graph by traversing `grad_fn` from the output tensor, and I can also store the intermediate activation maps using forward hooks. However, I don't have the variable (tensor) graph. To trace back from the output, something very similar to `backward()` needs to happen. The problem is that I don't know the inputs of each node in the function graph; I have the outputs and the weights, but not the inputs. I was hoping to use the activation maps from the forward hooks, but there is no ID or name that would let me connect and overlay the two graphs (function and tensor).

Let’s make the question more specific:

1 - Is there a way to access the internal buffers of a module? (`._buffers` seems to be always empty.)
2 - Can anyone suggest a better way to solve this?
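Regarding the first question, here is the minimal check I mean (as far as I know, buffers only exist on modules that explicitly register them, e.g. BatchNorm's running statistics, which may be why they look empty elsewhere):

```python
import torch.nn as nn

# BatchNorm registers buffers: running_mean, running_var, num_batches_tracked.
bn = nn.BatchNorm2d(8)
print(list(dict(bn.named_buffers()).keys()))

# A Conv2d registers no buffers of its own, so this prints an empty list.
print(list(nn.Conv2d(1, 4, 3).named_buffers()))
```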