I am trying to understand when a `backward` method must be provided.
For example, let's say I have `x` and `y` as inputs (both requiring gradients). Variable `x` already goes through a loss function (which computes backprop for it), and at the same time is input to another `nn.Module` that itself does not compute any gradients:

```python
loss_x = first_loss(x)
loss_y = second_loss(func(x), y)
total_loss = loss_x + loss_y
```
Would autograd automatically realize that the `x` passed to `func` does not require gradients computed by `func`? I have a custom CUDA extension for `func` and want to know whether it is OK to return `None` from its `backward` in this case (`second_loss` will return `None` and `grad_y` for its two inputs; is it OK if `func` also returns `None`?). Here `x` has `requires_grad=True`, but not through the branch that goes to `func`.
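
To make the question concrete, here is a minimal sketch of the pattern I am asking about (`MyFunc` and the `x * 2` computation are placeholders, not my real kernel):

```python
import torch

class MyFunc(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # Stand-in for the real forward computation (e.g. the CUDA kernel).
        return x * 2

    @staticmethod
    def backward(ctx, grad_output):
        # ctx.needs_input_grad[0] reports whether autograd actually needs
        # a gradient for x through this branch; if not, None is allowed.
        if not ctx.needs_input_grad[0]:
            return None
        return grad_output * 2
```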
-
This will be the only function binding for this custom extension (would it be valid, or do I need to provide a dummy `backward` too?):
```cpp
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
  m.def("forward", &rt_forward, "Target forward (CUDA)");
}
```
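
On the Python side, I would wrap it roughly like this (assuming the compiled module is imported as `my_ext`; the `backward` here is the dummy I am asking about):

```python
import torch
import my_ext  # the compiled extension above (module name is illustrative)

class RTForward(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # Call the bound CUDA forward; no gradient formula is implemented.
        return my_ext.forward(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Dummy backward: declare that this op contributes no gradient to x.
        return None

func = RTForward.apply
```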