Forward-only Function

  1. I am trying to understand when a backward method must be provided.
    For example, let's say I have x and y as inputs (both requiring gradients). Suppose x already goes through a loss function (which computes the backprop for it) and, at the same time, is the input to another nn.Module that itself does not need to compute any gradients:

    loss_x = first_loss(x)
    loss_y = second_loss(func(x), y)
    total_loss = loss_x + loss_y

Would autograd automatically realize that the x passed to func does not require gradients computed by func?
I have a custom CUDA extension for func and want to know whether it is OK to return None from its backward in this case (second_loss will return None, grad_y; is it also OK for func to return None?). Here x has requires_grad=True, but its gradient should not come from the branch that goes through func.
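A minimal sketch of what happens in that case, with a toy Func standing in for the custom CUDA op (the doubling is arbitrary): because x requires grad on this branch too, autograd will still call Func's backward, but returning None from it is allowed and simply means this branch contributes nothing to x.grad.

    import torch

    class Func(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x):
            return x * 2.0          # stand-in for the CUDA forward kernel

        @staticmethod
        def backward(ctx, grad_output):
            return None             # no gradient flows back through this branch

    x = torch.randn(3, requires_grad=True)
    y = torch.randn(3)

    loss_x = x.sum()                             # first_loss branch
    loss_y = ((Func.apply(x) - y) ** 2).sum()    # second_loss(func(x), y) branch
    (loss_x + loss_y).backward()

    print(x.grad)   # tensor([1., 1., 1.]) -- only the loss_x branch contributed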

  1. This will be the only function binding for this custom extension (would it be valid, or do I need to provide a dummy backward too?):

    PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
      m.def("forward", &rt_forward, "Target forward (CUDA)");
    }
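For reference, a hedged sketch of how such a forward-only binding is typically built and called from Python with torch.utils.cpp_extension; the source file names here are placeholders, not the actual extension:

    import torch
    from torch.utils.cpp_extension import load

    # JIT-compile the extension; the source file names are hypothetical
    rt = load(name="rt_extension",
              sources=["rt_cuda.cpp", "rt_cuda_kernel.cu"],
              verbose=True)

    out = rt.forward(torch.randn(8, device="cuda"))   # calls rt_forward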

OK, after trying a few things:

  1. In order to make sure func is not included in the autograd graph, all I had to do was pass x.data to it instead of x:

    loss_y = second_loss(func(x.data), y)

This way I don't have to think about whether its backward should return None; backward will simply not be called by autograd.
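A quick way to check this behaviour with a pure-Python stand-in for func (note that x.detach() is the currently recommended equivalent of x.data):

    import torch

    x = torch.randn(3, requires_grad=True)
    y = torch.randn(3)

    func = lambda t: t * 2.0     # stand-in for the forward-only CUDA op

    loss_x = x.sum()
    loss_y = ((func(x.data) - y) ** 2).sum()

    print(loss_y.requires_grad)  # False: the func branch is not tracked
    (loss_x + loss_y).backward()
    print(x.grad)                # tensor([1., 1., 1.]) -- only from loss_x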

  1. It is OK to only bind forward for a CUDA extension; it works just fine.
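And if the op ever needs to participate in autograd later, the backward still does not have to live in the C++ binding: a Python-side torch.autograd.Function wrapper around the bound forward is enough. A hypothetical sketch (rt being the module loaded as in the earlier snippet):

    import torch

    class RTFunc(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x):
            return rt.forward(x)   # the forward-only CUDA binding

        @staticmethod
        def backward(ctx, grad_output):
            return None            # dummy backward; replace if real gradients are needed

    # out = RTFunc.apply(x)        # usage, once gradients through func matter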