Change in Function behaviour?

I have a Function whose only job is to negate its gradient. After upgrading to 0.2.0, I noticed that its backward function is no longer being called. Has the behaviour of the computational graph changed?

class NegateGradient(Function):
    """ forward identity, backward negation, useful in minimax """
    def forward(self, input):
        return input
    def backward(self, grad_output):
        return -grad_output

If I change return input to return input+1e-10, then the backward function would be correctly called again. If this is some newly introduced graph optimization, can someone please point me to the documentation?

The problem is that the output of your forward pass shares its storage with the input, so you need to mark the input as "dirty" with the mark_dirty() function: add self.mark_dirty(input) to your forward pass.
You can also change this to use the new style for Function:

class NegateGradient(Function):
    """ forward identity, backward negation, useful in minimax """
    @staticmethod
    def forward(ctx, input):
        return input

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

Be careful, though: new-style Functions (which use static methods) are used slightly differently. You should call the .apply() method instead of calling an instance with the arguments of your forward pass.
In your particular case you should change neg_grad_b = NegateGradient()(b) to neg_grad_b = NegateGradient.apply(b).
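
As a rough illustration of the .apply() call style, here is a minimal sketch. It assumes a recent PyTorch release, where plain tensors created with requires_grad=True take the place of Variable; the tensor b and its shape are made up for the example. The forward pass returns input.view_as(input), a choice made for this sketch so that the output is a distinct tensor object over the same storage.

```python
import torch
from torch.autograd import Function


class NegateGradient(Function):
    """Forward identity, backward negation."""

    @staticmethod
    def forward(ctx, input):
        # view_as returns a new tensor object over the same storage,
        # so the output is not the very same object as the input
        # (a choice made for this sketch).
        return input.view_as(input)

    @staticmethod
    def backward(ctx, grad_output):
        # Flip the sign of the incoming gradient.
        return -grad_output


# Hypothetical input, only for illustration.
b = torch.ones(3, requires_grad=True)

# New-style Functions are invoked through .apply, not through an instance.
neg_grad_b = NegateGradient.apply(b)

neg_grad_b.sum().backward()
print(b.grad)  # tensor([-1., -1., -1.])
```

The forward values are unchanged, while the gradient that reaches b has its sign flipped.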


Tried to replicate this

neg = NegateGradient.apply

but got

AttributeError: type object 'NegateGradient' has no attribute 'apply'


NegateGradient.__dict__ gives:

dict_proxy({'__doc__': ' forward identity, backward negation, useful in minimax ',
'__module__': '__main__',
'backward': <staticmethod at 0x111a5e440>,
'forward': <staticmethod at 0x111a5e398>})

You will need to use pytorch >= 0.2 for this to work.


Thanks! Didn’t realize I was actually using an in-place function lol. Strange though, I remember backward getting called in 0.1.12.

The autograd engine was refactored in 0.2, so this particular case may have worked before the refactor, even though it was never guaranteed to.

What would happen if one still calls a Function the old way, as in NegateGradient()(b)? Would it break, or would it behave differently from what we expect? I ask because I use 0.2 and have multiple calls like this in my code.

Old-style Functions should be called that way. If you try to call a new-style Function this way, it will raise an error.

Using return input.view_as(input) also works.

In this case, yes. But the engine then does not know that the output shares its storage with the input, which can cause subtle bugs in some cases, so you should use mark_dirty().

If I change the NegateGradient as follows:

class NegateGradient(Function):
    @staticmethod
    def forward(ctx, input, a):
        ctx.a = a  # scale factor, reused in backward

        return input

    @staticmethod
    def backward(ctx, grad_output):
        output = -grad_output * ctx.a

        return output, None  # no gradient for a

It raises the error "one of the variables needed for gradient computation has been modified by an inplace operation". Can you explain it?

That means that somewhere around your use of this function, you perform in-place operations on either the input or the output. Since you use the mark_dirty() function, the autograd engine is now able to detect this problem and raise an error.
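
To make this kind of failure concrete, here is a small sketch of one way it can arise. It assumes a recent PyTorch release; the class name NegateGradientMarkDirty and the tensors leaf, x and y are hypothetical and only serve the illustration. An earlier operation saves x for its backward pass, and the Function then marks x as dirty, so the engine sees that the saved tensor is out of date.

```python
import torch
from torch.autograd import Function


class NegateGradientMarkDirty(Function):  # hypothetical name for this sketch
    @staticmethod
    def forward(ctx, input):
        # Tell autograd that the input is returned as the output.
        ctx.mark_dirty(input)
        return input

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output


leaf = torch.ones(3, requires_grad=True)
x = leaf.clone()   # non-leaf tensor, so it may be marked dirty
y = x * x          # the multiplication saves x for its backward pass

NegateGradientMarkDirty.apply(x)  # x now counts as modified in place

try:
    y.sum().backward()
    raised = False
except RuntimeError as err:
    raised = True
    print(err)
```

In this sketch the backward pass of y needs the value of x from before the Function was applied, which is why the engine refuses to continue.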

I see, thank you again