Gradient calculated in custom autograd function is not assigned to the .grad field


I am trying to run the following custom autograd function, which is used in the mixed operation of ProxylessNAS.

class ArchGradientFunction(torch.autograd.Function):

    @staticmethod
    def forward(ctx, x, binary_gates, run_func, backward_func):
        ctx.run_func = run_func
        ctx.backward_func = backward_func
        detached_x = detach_variable(x)
        with torch.enable_grad():
            output = run_func(detached_x)
        ctx.save_for_backward(detached_x, output)
        return output.data

    @staticmethod
    def backward(ctx, grad_output):
        detached_x, output = ctx.saved_tensors
        grad_x = torch.autograd.grad(output, detached_x, grad_output, only_inputs=True)
        # compute gradients w.r.t. binary_gates
        binary_grads = ctx.backward_func(detached_x.data, output.data, grad_output.data)
        return grad_x[0], binary_grads, None, None

As shown, the gradient w.r.t. binary_gates is computed in backward. But binary_gates.grad is None when I track it with backward hooks. I know that asking about a specific project here may not be the best choice, but since the authors have closed the issue on GitHub, I have to ask for help: what could be the cause in this case?

Thanks in advance!


Asking questions about specific projects on the forum is fine.

You should be careful and never use .data. Use .detach() or with torch.no_grad() instead.
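Here is a minimal sketch (not the ProxylessNAS code) of why .data is dangerous: it lets you modify a tensor behind autograd's back, so backward silently uses the wrong value instead of raising an error.

```python
import torch

a = torch.tensor([2.0], requires_grad=True)
b = a ** 2              # backward will need a's value (d(b)/d(a) = 2 * a)
a.data.mul_(10)         # modifies a in place; autograd does not notice
b.sum().backward()
print(a.grad)           # tensor([40.]) -- silently wrong; should be tensor([4.])
```

Doing the same in-place change through a.detach().mul_(10) would instead raise a RuntimeError during backward, because .detach() shares the version counter with a and autograd detects the modification.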

binary_gates.grad will only be populated if it is a leaf variable and it requires gradients.
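A quick illustration of that rule (a hypothetical minimal example, not your code):

```python
import torch

leaf = torch.tensor([1.0], requires_grad=True)  # leaf: created by the user
hidden = leaf * 2                               # non-leaf: result of an op
loss = (hidden ** 2).sum()
loss.backward()
print(leaf.grad)    # tensor([8.]) -- leaf, so .grad is populated
print(hidden.grad)  # None -- non-leaf; .grad is not kept by default
```

If you do need the gradient of a non-leaf tensor, call hidden.retain_grad() before running backward.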
Also, the backward hook on nn.Module (which should not be used, see the doc) runs before the .grad field of its inputs is populated. So even for a leaf Tensor, it is expected that you don't see the gradient in the hook.
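If you want to observe the gradient reliably, one option (assuming binary_gates is a plain Tensor) is Tensor.register_hook, which receives the gradient directly as an argument instead of reading the .grad field:

```python
import torch

w = torch.tensor([3.0], requires_grad=True)
seen = []
# the hook fires with the gradient itself, so it does not depend on
# when (or whether) the .grad field gets written
w.register_hook(lambda g: seen.append(g))
(w * 5).sum().backward()
print(seen[0])  # tensor([5.])
print(w.grad)   # tensor([5.]) -- also populated, since w is a leaf
```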
