Extending PyTorch Example

I looked at the custom backward example for extending PyTorch. My question is: how can we access grad_input, grad_weight, grad_bias, or any other value returned from the backward function after calling loss.backward() in this example?

from torch.autograd import Function

# Inherit from Function
class LinearFunction(Function):

    # Note that both forward and backward are @staticmethods
    # bias is an optional argument
    @staticmethod
    def forward(ctx, input, weight, bias=None):
        ctx.save_for_backward(input, weight, bias)
        output = input.mm(weight.t())
        if bias is not None:
            output += bias.unsqueeze(0).expand_as(output)
        return output

    # This function has only a single output, so it gets only one gradient
    @staticmethod
    def backward(ctx, grad_output):
        # This is a pattern that is very convenient - at the top of backward
        # unpack saved_tensors and initialize all gradients w.r.t. inputs to
        # None. Thanks to the fact that additional trailing Nones are
        # ignored, the return statement is simple even when the function has
        # optional inputs.
        input, weight, bias = ctx.saved_tensors
        grad_input = grad_weight = grad_bias = None

        # These needs_input_grad checks are optional and there only to
        # improve efficiency. If you want to make your code simpler, you can
        # skip them. Returning gradients for inputs that don't require it is
        # not an error.
        if ctx.needs_input_grad[0]:
            grad_input = grad_output.mm(weight)
        if ctx.needs_input_grad[1]:
            grad_weight = grad_output.t().mm(input)
        if bias is not None and ctx.needs_input_grad[2]:
            grad_bias = grad_output.sum(0)

        return grad_input, grad_weight, grad_bias


Could you explain why you want to access these?
Also, forward and backward are regular Python functions, so you can simply store these values in any structure in the parent scope from inside backward.
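
To illustrate the parent-scope idea, here is a minimal sketch. It uses a small hypothetical ScaleFunction (element-wise multiply) rather than the LinearFunction above, and the captured_grads dict is just an illustrative name, not part of PyTorch:

```python
import torch
from torch.autograd import Function

# Hypothetical module-level container; any structure in the parent scope works.
captured_grads = {}


class ScaleFunction(Function):
    """Hypothetical element-wise multiply, used only to show the pattern."""

    @staticmethod
    def forward(ctx, input, weight):
        ctx.save_for_backward(input, weight)
        return input * weight

    @staticmethod
    def backward(ctx, grad_output):
        input, weight = ctx.saved_tensors
        grad_input = grad_output * weight
        grad_weight = grad_output * input
        # Store CPU copies in the parent scope so they can be read
        # after loss.backward() returns.
        captured_grads["input"] = grad_input.detach().cpu()
        captured_grads["weight"] = grad_weight.detach().cpu()
        return grad_input, grad_weight


x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
w = torch.tensor([4.0, 5.0, 6.0], requires_grad=True)
loss = ScaleFunction.apply(x, w).sum()
loss.backward()

print(captured_grads["input"])   # gradient w.r.t. x
print(captured_grads["weight"])  # gradient w.r.t. w
```

The dict is overwritten on every backward pass, so if the function is applied several times in one graph you would probably want to append to a list instead.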

In my actual example, I would like to copy the gradients to the CPU (after computing them on the GPU) and then access the CPU copies of the gradients after loss.backward().

In that case, you can store them in the parent scope from inside backward and access them later.
Alternatively, register a hook directly on a Tensor with t.register_hook(fn); fn is called with the gradient once the gradient for that Tensor has been computed.
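
A rough sketch of the hook approach, assuming a plain matrix multiply in place of the custom function. The cpu_grads dict and make_hook helper are hypothetical names for illustration, and the device falls back to the CPU when no GPU is available:

```python
import torch

# Hypothetical container for the CPU copies of the gradients.
cpu_grads = {}


def make_hook(name):
    """Build a hook that stores a CPU copy of the gradient under `name`."""
    def hook(grad):
        cpu_grads[name] = grad.detach().cpu()
        # Returning None leaves the gradient itself unchanged.
    return hook


device = "cuda" if torch.cuda.is_available() else "cpu"

weight = torch.tensor([[1.0, 2.0], [3.0, 4.0]], device=device, requires_grad=True)
inp = torch.tensor([[1.0, 1.0]], device=device)

# Hooks can be registered on leaf tensors and on intermediate results.
weight.register_hook(make_hook("weight"))
output = inp.mm(weight.t())
output.register_hook(make_hook("output"))

loss = output.sum()
loss.backward()

print(cpu_grads["weight"])  # CPU copy of the gradient w.r.t. weight
print(cpu_grads["output"])  # CPU copy of the gradient w.r.t. output
```

Compared with storing values inside backward, hooks need no change to the custom Function, and they also work for intermediate tensors whose .grad is not retained by default.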