Simple autograd fail

Hi guys, very new to PyTorch but I have been using it successfully on a few small projects until I hit a problem, summarised in the MWE below. Should this not be returning 6? I must be missing something simple here, but I can't see what.

import torch

p = torch.tensor([1.], requires_grad=True)

class Square(torch.autograd.Function):
    @staticmethod
    def forward(ctx, target):
        ctx.save_for_backward(target)
        ir = torch.pow(target, 2.)
        return ir

    @staticmethod
    def backward(ctx, u):
        target, = ctx.saved_tensors
        return target * 2.

sq = Square.apply(p)
six = sq.pow(3.)
six.backward()
print(p.grad)

This gives p.grad = tensor([2.]), which is clearly just the derivative from the square, so why is the cube part not also being applied?

Thank you!!

Because you are not using the gradient from the operations that come after your custom autograd function.

If you check your backward method, it has an argument u which contains the accumulated gradient from all operations after Square. Because you are not using it, no matter what you do, you break all the backward edges right there.
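A quick way to see this (a sketch based on the MWE above, with `@staticmethod` and `ctx.save_for_backward` added so it actually runs): print u inside backward. For the cube that follows Square, u arrives as d(sq³)/d(sq) = 3·sq² = 3, but it is then thrown away.

```python
import torch

class Square(torch.autograd.Function):
    @staticmethod
    def forward(ctx, target):
        ctx.save_for_backward(target)  # so backward can read target
        return torch.pow(target, 2.)

    @staticmethod
    def backward(ctx, u):
        # u is the gradient flowing in from everything after Square;
        # here that is d(sq**3)/d(sq) = 3 * sq**2 = 3 at p = 1.
        print(u)  # tensor([3.])
        target, = ctx.saved_tensors
        return target * 2.  # still wrong: u is ignored

p = torch.tensor([1.], requires_grad=True)
six = Square.apply(p).pow(3.)
six.backward()
print(p.grad)  # tensor([2.]) -- the chain is broken at Square
```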

What you are computing:

d(sq)/dp = 2p = 2

What you should compute, by the chain rule:

d(six)/dp = d(six)/d(sq) * d(sq)/dp

where d(six)/d(sq) = 3 * sq^2 = 3 is exactly the u passed into backward.

So overall:

d(six)/dp = u * 2p = 3 * 2 = 6
To fix this, just incorporate u into your backward computation:

return u * (target * 2.)
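Putting the fix together, here is a full working version of the MWE (with `@staticmethod` decorators, which `torch.autograd.Function` requires, and `ctx.save_for_backward` so backward can access target):

```python
import torch

class Square(torch.autograd.Function):
    @staticmethod
    def forward(ctx, target):
        ctx.save_for_backward(target)  # stash target for use in backward
        return torch.pow(target, 2.)

    @staticmethod
    def backward(ctx, u):
        target, = ctx.saved_tensors
        # chain rule: multiply the incoming gradient u by the local derivative
        return u * (target * 2.)

p = torch.tensor([1.], requires_grad=True)
six = Square.apply(p).pow(3.)
six.backward()
print(p.grad)  # tensor([6.])
```

You can also sanity-check custom Functions numerically with torch.autograd.gradcheck, which would have flagged the original backward as wrong.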


Of course! Thank you, and nice explanation
