I’ve just started with PyTorch. I’ve got a question about the “Defining New Autograd Functions” section of one of the first tutorials in the 1.7.1 docs:
This section explains how to define custom forward and backward methods, which is nice, but the presented definition doesn’t seem to connect with the training loop in the same example. Do I understand correctly that the training loop in this section would behave the same whether P3 is defined as LegendrePolynomial3.apply or simply as
return 0.5 * (5 * x ** 3 - 3 * x)
If so, what is the advantage of inheriting from torch.autograd.Function here?
Yes, you are right: in this example it doesn’t matter for the training loop whether the function is defined as LegendrePolynomial3.apply or just as a plain subroutine P3. This is a simple example meant to show how to define a custom autograd.Function. Note that the elementary operators involved in this subroutine are *, ** and -. Under the hood, PyTorch has already implemented the backward (gradient) rules for these operations, so autograd can differentiate the plain subroutine automatically.
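A minimal sketch of the two equivalent definitions (the custom Function mirrors the tutorial; the plain subroutine name P3_plain is mine). The hand-written backward uses the analytic derivative d/dx [0.5·(5x³ − 3x)] = 1.5·(5x² − 1), and both paths should produce identical gradients:

```python
import torch

class LegendrePolynomial3(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # Save the input for use in the backward pass.
        ctx.save_for_backward(x)
        return 0.5 * (5 * x ** 3 - 3 * x)

    @staticmethod
    def backward(ctx, grad_output):
        x, = ctx.saved_tensors
        # Hand-written derivative: 1.5 * (5x^2 - 1).
        return grad_output * 1.5 * (5 * x ** 2 - 1)

def P3_plain(x):
    # Same polynomial built from *, ** and -; autograd
    # differentiates these elementary ops automatically.
    return 0.5 * (5 * x ** 3 - 3 * x)

x1 = torch.linspace(-1, 1, 5, requires_grad=True)
x2 = x1.detach().clone().requires_grad_(True)

LegendrePolynomial3.apply(x1).sum().backward()
P3_plain(x2).sum().backward()

print(torch.allclose(x1.grad, x2.grad))  # True
```

So for this polynomial the custom Function buys you nothing over the plain version; it exists purely to demonstrate the mechanism.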
In scenarios where you need some complicated custom operation that is not defined in PyTorch, you typically write that operation (both forward and backward) as a CUDA C++ op and connect Python and CUDA C++ via autograd.Function.
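To illustrate the shape of such a case in pure Python, here is a hypothetical op (ClippedExp is my own invented example, not from the tutorial) whose gradient rule deliberately differs from what autograd would derive, so the backward must be spelled out by hand; in a real extension the bodies of forward and backward would call into compiled CUDA C++ kernels:

```python
import torch

class ClippedExp(torch.autograd.Function):
    """exp(x) in the forward pass, but with a custom, saturating
    gradient. Autograd cannot derive this rule on its own."""

    @staticmethod
    def forward(ctx, x):
        y = torch.exp(x)
        # Save the output; the backward rule is phrased in terms of y.
        ctx.save_for_backward(y)
        return y

    @staticmethod
    def backward(ctx, grad_output):
        y, = ctx.saved_tensors
        # Custom rule: clamp the local gradient to avoid blow-up.
        return grad_output * y.clamp(max=10.0)

x = torch.tensor([0.0, 3.0], requires_grad=True)
ClippedExp.apply(x).sum().backward()
print(x.grad)  # exp(0) = 1; exp(3) ≈ 20.1, clamped to 10
```

The same subclassing pattern is how the official C++/CUDA extension tutorials wire compiled kernels into autograd.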