Two backward hooks cannot access the same dict-object

When I register backward hooks on two nn.Sigmoid() modules that print activations saved in a dict during the forward pass, I encounter no problems. However, as soon as I make one of the backward hooks return a new grad_input, the other backward hook raises an error when trying to access the activation dict. Where does this error stem from? And how should I access forward activations if I cannot use a dict?

Network

import torch
import torch.nn as nn

class HookNet(nn.Module):
    def __init__(self):
        super(HookNet, self).__init__()
        self.fc1 = nn.Linear(2,2)
        self.s1 = nn.Sigmoid()
        self.fc2 = nn.Linear(2,1)
        self.s2 = nn.Sigmoid()
        self.fc1.weight = torch.nn.Parameter(torch.Tensor([[1, 2],[-1, 2]]))
        self.fc1.bias = torch.nn.Parameter(torch.Tensor([0]))
        self.fc2.weight = torch.nn.Parameter(torch.Tensor([[1, 2]]))
        self.fc2.bias = torch.nn.Parameter(torch.Tensor([0]))
        
    def forward(self, x):
        x= self.fc1(x)
        x = self.s1(x)
        x= self.fc2(x)
        x = self.s2(x)
        return x

hooknet = HookNet()

Saving forward activations and printing them in the backward pass

saved_activations = {}
def forward_save_act(name, module, input, output):
    # Save the module's input/output tensors under its name
    saved_activations[name] = (input[0].data, output.data)

def backward_use_act(name, module, grad_input, grad_output):
    print('___Backward pass for '+str(name)+'___')
    input, output = saved_activations[name]
    print('Saved Input: '+str(input))
    print('Saved Output: '+str(output))
    print('Grad_input needed to be overwritten: '+str(grad_input))
    # Replacement rule: scale grad_output by output/input
    new_grad_input = output/input
    grad_tuple = (new_grad_input.data*grad_output[0],)
    print('New grad input: '+str(grad_tuple))

from functools import partial
for name, m in hooknet.named_modules():
    if type(m) == nn.Sigmoid:
        m.register_forward_hook(partial(forward_save_act, name))
        m.register_backward_hook(partial(backward_use_act, name))

This works well: during the backward pass, the hooks have no trouble accessing saved_activations and printing them. An example:

inp = torch.Tensor([1, 1])
inp.requires_grad=True
out = hooknet(inp)
out.backward()

Output: [0.91794074]
___Backward pass for s2___
Saved Input: tensor([2.4147])
Saved Output: tensor([0.9179])
Grad_input needed to be overwritten: (tensor([0.0753]),)
New grad input: (tensor([0.3801]),)
___Backward pass for s1___
Saved Input: tensor([3., 1.])
Saved Output: tensor([0.9526, 0.7311])
Grad_input needed to be overwritten: (tensor([0.0034, 0.0296]),)
New grad input: (tensor([0.0239, 0.1101]),)

However, if I return the calculated grad_input, I encounter problems during the backward pass for s1. Interestingly, the error appears to be connected to the loading from the dict.

def backward_use_act(name, module, grad_input, grad_output):
    print('___Backward pass for '+str(name)+'___')
    input, output = saved_activations[name]
    print('Saved Input: '+str(input))
    print('Saved Output: '+str(output))
    print('Grad_input needed to be overwritten: '+str(grad_input))
    new_grad_input = output/input
    grad_tuple = (new_grad_input.data*grad_output[0],)
    print('New grad input: '+str(grad_tuple))
    return grad_tuple

from functools import partial
for name, m in hooknet.named_modules():
    if type(m) == nn.Sigmoid:
        m.register_forward_hook(partial(forward_save_act, name))
        m.register_backward_hook(partial(backward_use_act, name))
        
inp = torch.Tensor([1, 1])
inp.requires_grad=True
out = hooknet(inp)
out.backward()

___Backward pass for s2___
Saved Input: tensor([2.4147])
Saved Output: tensor([0.9179])
Grad_input needed to be overwritten: (tensor([0.0753]),)
New grad input: (tensor([0.3801]),)
Traceback (most recent call last):

File "", line 49, in
out.backward()

File "/Users/Eigil/opt/anaconda3/lib/python3.7/site-packages/torch/tensor.py", line 195, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)

File "/Users/Eigil/opt/anaconda3/lib/python3.7/site-packages/torch/autograd/__init__.py", line 99, in backward
allow_unreachable=True) # allow_unreachable flag

File "", line 28, in backward_use_act
print('___Backward pass for '+str(name)+'___')

SystemError: <class 'str'> returned a result with an error set

If I hardcode the grad_tuples rather than loading them from the dict (or pass them as an additional argument through partial), I encounter no problems.
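
By hardcoding I mean something like the following (a hypothetical sketch, with the values copied from the printed run above):

def backward_use_act_hardcoded(name, module, grad_input, grad_output):
    # Hypothetical: return precomputed grad tuples instead of reading
    # them from saved_activations (values from the run printed above)
    if name == 's2':
        return (torch.Tensor([0.3801]),)
    return (torch.Tensor([0.0239, 0.1101]),)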

Hi,

Is it really related to the dict? If you just return a constant, does it behave differently?

Whoops, I guess not. That's a little embarrassing; should I change the title of the post?
If I use the alternative backward hook function

def backward_use_act(name, module, grad_input, grad_output):
    print('___Backward pass for '+str(name)+'___')
    print('Grad_input needed to be overwritten: '+str(grad_input))
    grad_tuple = (grad_input[0],)
    print('New grad input: '+str(grad_tuple))
    return grad_tuple

I also get an error when the backprop reaches s1:

___Backward pass for s2___
Grad_input needed to be overwritten: (tensor([0.0753]),)
New grad input: (tensor([0.0753]),)
Traceback (most recent call last):

File "", line 46, in
out.backward()

File "/Users/Eigil/opt/anaconda3/lib/python3.7/site-packages/torch/tensor.py", line 195, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)

File "/Users/Eigil/opt/anaconda3/lib/python3.7/site-packages/torch/autograd/__init__.py", line 99, in backward
allow_unreachable=True) # allow_unreachable flag

File "", line 28, in backward_use_act
print('___Backward pass for '+str(name)+'___')

SystemError: <class 'str'> returned a result with an error set

However, I get no error if I don't include the return call:

def backward_use_act(name, module, grad_input, grad_output):
    print('___Backward pass for '+str(name)+'___')
    print('Grad_input needed to be overwritten: '+str(grad_input))
    grad_tuple = (grad_input[0],)
    print('New grad input: '+str(grad_tuple))

___Backward pass for s2___
Grad_input needed to be overwritten: (tensor([0.0753]),)
New grad input: (tensor([0.0753]),)
___Backward pass for s1___
Grad_input needed to be overwritten: (tensor([0.0034, 0.0296]),)
New grad input: (tensor([0.0034, 0.0296]),)

I do not understand this behavior. In this instance, the returned objects are identical to the computed grad_input, yet errors are still raised?

I am not sure why this happens…
Maybe it is due to the partial that you use to pass the name?

Also note that we currently discourage the use of register_backward_hook() on nn.Module (or at least, be aware that it can return incorrect results); see the docs for more details.

It might be due to partial; however, the name seems to be passed without trouble (it prints the right name and appears to load the right activations).

I am aware of the discouragement; however, for non-linear modules such as nn.Sigmoid, nn.Tanh and nn.ReLU, I hypothesized that the very simple input-output relationship (only one node in the computation graph) removes the ambiguity of grad_input and grad_output, unlike for nn.Linear modules, where the gradients could be with respect to weights, biases, or earlier layers. It seems I might have been mistaken, though.
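
The ambiguity for nn.Linear can be seen by printing grad_input inside a hook. A quick sketch (keeping in mind that the exact contents and ordering of grad_input are an implementation detail, which is exactly the problem):

lin = nn.Linear(2, 1)
# grad_input here holds several gradients (w.r.t. bias, input and
# weight), not just the one w.r.t. the layer's input
lin.register_backward_hook(
    lambda module, gin, gout: print([g.shape if g is not None else None for g in gin]))
lin(torch.ones(1, 2, requires_grad=True)).sum().backward()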

If backward hooks are broken, are there any other options for changing the calculated gradients of the non-linearities?

Hi,

You can use the Tensor hooks directly; those will work fine.
In particular, if you add a hook to the input of the sigmoid, you will be able to change the gradient that flows back from there.
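
As a minimal standalone illustration of a Tensor hook (independent of your network):

x = torch.tensor([1.0, 2.0], requires_grad=True)
y = torch.sigmoid(x)
# The hook receives dL/dx and can return a replacement gradient
x.register_hook(lambda grad: 2 * grad)
y.sum().backward()
print(x.grad)  # twice the usual sigmoid gradient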

I was mentioning the name because the error seems to point to it:

print('___Backward pass for '+str(name)+'___')

SystemError: <class 'str'> returned a result with an error set

Thank you for your help, Alban.

I encounter errors even without the print statements, so I don't think the name is causing the error:

def backward_use_act(name, module, grad_input, grad_output):
    grad_tuple = (grad_input[0],)
    return grad_tuple

Traceback (most recent call last):

File "", line 42, in
out.backward()

File "/Users/Eigil/opt/anaconda3/lib/python3.7/site-packages/torch/tensor.py", line 195, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)

File "/Users/Eigil/opt/anaconda3/lib/python3.7/site-packages/torch/autograd/__init__.py", line 99, in backward
allow_unreachable=True) # allow_unreachable flag

SystemError: PyEval_EvalFrameEx returned a result with an error set

For the tensor hook (register_hook), how would I access the tensor that gets fed into the Sigmoid? This tensor is produced from the weights, biases and activations of an nn.Linear module; how do I access the resulting Tensor in my example?

The current workaround would be to use a register_forward_hook() on the nn.Module, as you were doing, and in this forward hook, add a Tensor register_hook() on the inputs (or outputs) of interest.
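
A minimal sketch of that workaround for your network (reusing hooknet and partial from above), assuming you want to keep the same output/input replacement rule; grad_output is recovered from the incoming gradient via the sigmoid derivative out * (1 - out):

saved_activations = {}

def forward_save_act(name, module, inp, out):
    # Save the activations as before
    saved_activations[name] = (inp[0].detach(), out.detach())

    def tensor_hook(grad):
        # grad is dL/d(sigmoid input) = sigmoid'(x) * grad_output,
        # so grad_output can be recovered via the sigmoid derivative
        saved_in, saved_out = saved_activations[name]
        grad_output = grad / (saved_out * (1 - saved_out))
        # Apply the same replacement rule as the module backward hook
        return (saved_out / saved_in) * grad_output

    # Returning a value from a Tensor hook on the sigmoid's input
    # replaces the gradient flowing back from that point
    inp[0].register_hook(tensor_hook)

for name, m in hooknet.named_modules():
    if type(m) == nn.Sigmoid:
        m.register_forward_hook(partial(forward_save_act, name))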

This looks very surprising. Can you try to get the smallest possible example that reproduces this, so that we can open a bug report on GitHub?

Did anyone find the solution? I get the same error while using .backward() to get the gradient.

Which error? Can you share the full error stack/message and, if possible, a code sample to reproduce it?