Question about tensors in autograd function

I have this autograd function.

class raw(autograd.Function):
    @staticmethod
    def forward(ctx, inp):
        ctx.a = inp * inp + 1
        print(inp.requires_grad, ctx.a.requires_grad)
        return inp * ctx.a.pow(-0.5)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output * ctx.a.pow(-1.5)

when I ran this function like raw.apply(tc.randn(1, requires_grad=True)), the output is True False. In this case, ctx.a does not require grad. But don’t tensors that require grad produce tensors that also requires grad? For example

>>> a=tc.randn(1,requires_grad=True)
>>> b=a*a+1
>>> b.requires_grad
True

I’m quite confused about what is happening during the process of calling raw.apply(I don’t know where to find the implementation of .apply either).
Can someone explain this process to me?
Thank u very much.

Hi,

The requires_grad field is used to track operation and Tensors that require gradients. Here since you are inside a autograd.Function. You do no use the autograd anymore to get gradients, you code the backward yourself. So the Tensors that are created in during the forward pass do not require gradients as the autograd engine does not need to track operations on them to compute gradients.

Thanks for ur reply, that solved my problem.
Just another question, do u know where is the source code of the apply of autograd functions?

Sure, it is here. The part that disable the gradient is the GradMode(false); that is equivalent to with torch.no_grad() in python.
Let me know if you have more question about this !

1 Like

Thanks! Then is there a way implement an autograd function in C++ and use it directly in Python(not writing the forward and backward separately and wrap it in Python)?

You can use the cpp api equivalent: torch::autograd::Function. And you will need to bind the ::apply() method properly to python.

EDIT: You can find here an example of the cpp Function. You can then bind its apply method just like any function.

Thanks. I tried your suggestion, but I got the following compiling error. I used a function based on the one in your example, and the C++ apply worked fine.
Here’s my code for binding the methods:

namespace py = pybind11;
PYBIND11_MODULE(actv, m) {
	py::class_<CPTanh>(m, "CPTanh")//line 25
		.def_static("forward", &CPTanh::forward)
		.def_static("backward", &CPTanh::backward)
		.def_static("apply",&CPTanh::apply); //line 28

}

Here’s the error:

actv.cpp(29): error C2672: 'pybind11::class_<CPTanh>::def_static': no matching overloaded function foun
d
actv.cpp(26): error C2783: 'pybind11::class_<CPTanh> &pybind11::class_<CPTanh>::def_static(const char *
,Func &&,const Extra &...)': could not deduce template argument for 'Func'

It seems that apply can’t be bind in the common way, do u have any idea about this(I’m quite new in C++)?

You don’t need to bind the whole class in python. Just the apply function will work. And if you can’t bind CPTanh::apply you can create a local function that then calls it.
Just like the python version, you should never call forward and backward yourself nor instantiate the class. Only the static method apply is needed.

Thanks, that solved the problem.