Forward-only Function

dashesy · June 2, 2018, 2:38am

I am trying to understand when backward method must be provided.
For example, lets say I have x and y as input (both requiring gradients). If variable x already goes through a loss function (that already computes backrpop for it), and at the same time is input to another nn.module that itself does not compute any gradients:

loss_x = first_loss(x)
loss_y = second_loss(func(x), y)
total_loss = loss_x + loss_y

Then would autograd automatically realize that x passed to func will not require gradients computed by func?
I have a custom cuda extension for func and want to see if it is Ok if I return None from its backward in this case (second_loss will return None, grad_y, is it Ok if func returns None). Here x is require_grad but not from the branch that goes to func.

This will be the only function binding for this custom extension (would it be valid, or do I need to provide a dummy backward too?):

PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
m.def(“forward”, &rt_forward, “Target forward (CUDA)”);
}

dashesy · June 4, 2018, 9:46pm

Ok after trying a few things.

In order to make sure func would not be included in the Autograd graph, I all had to do was to pass x.data to it instead of x:

loss_y = second_loss(func(x.data), y)

This way I won’t have to think if I should return None for it backward or not, backward will simply not be called by autograd.

It is Ok if we only bind forward for a cuda extension, it works just fine