My understanding is that to get autograd to work, we will also need to register the corresponding backward function. However, I could not find how this works. I checked the PyTorch source code (torch/csrc/autograd/generated/VariableTypeEverything.cpp) but could not find where backward functions are registered either.
You implement it in C++ much like you would in Python, by subclassing autograd::Function. You then have to register an op that uses your autograd function. See this example:
#include <torch/script.h>
#include <torch/all.h>
#include <iostream>
#include <memory>

using namespace at;
using torch::Tensor;
using torch::autograd::AutogradContext;
using torch::autograd::Variable;
using torch::autograd::variable_list;

// Computes f(x) = 2x.
class MyDouble : public torch::autograd::Function<MyDouble> {
 public:
  static variable_list forward(
      AutogradContext* ctx,
      Variable input) {
    // Nothing needs to be saved on ctx for this toy example.
    return {input + input};
  }

  static variable_list backward(
      AutogradContext* ctx,
      variable_list grad_output) {
    // Placeholder gradient: a hard-coded 2x2 tensor of 2s. The true gradient
    // of f(x) = 2x would be grad_output[0] * 2; this dummy value ignores
    // grad_output entirely (see the follow-up comment below).
    return {torch::ones({2, 2}) + 1};
  }
};

// Plain op implementation that routes through the autograd function.
Tensor double_op(const Tensor& input) {
  return MyDouble::apply(input)[0];
}

// Register the op so it is callable as my_ops::double_op (e.g. from
// TorchScript or torch.ops in Python).
static auto registry =
    torch::RegisterOperators("my_ops::double_op", &double_op);
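Once the op is compiled into a shared library, it can be loaded and called from Python. Here is a minimal sketch of that usage; the library name libmy_ops.so and the input shape are assumptions for illustration, not part of the original example:

import torch

# Load the compiled extension; the file name is hypothetical and depends on
# how the C++ code above is built (e.g. CMake or torch.utils.cpp_extension).
torch.ops.load_library("libmy_ops.so")

x = torch.randn(2, 2, requires_grad=True)
y = torch.ops.my_ops.double_op(x)
y.sum().backward()
print(x.grad)  # produced by MyDouble::backward, i.e. a 2x2 tensor of 2s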
Then you'll notice that the gradients from the torch-only function and from the composite TorchScript + PyTorch version are different; the torch-only function has the correct gradients.
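One way to see that mismatch is to run the same computation with and without the custom op, using an upstream gradient that is not all ones; the hard-coded backward above ignores grad_output, so the two results diverge. A rough sketch, reusing the same hypothetical library name as above:

import torch

torch.ops.load_library("libmy_ops.so")  # hypothetical library name

x1 = torch.randn(2, 2, requires_grad=True)
x2 = x1.detach().clone().requires_grad_(True)

# Custom op mixed with regular PyTorch ops: double, then square-and-sum.
(torch.ops.my_ops.double_op(x1) ** 2).sum().backward()
# Torch-only version of the same function.
((2 * x2) ** 2).sum().backward()

print(x1.grad)  # constant 2s: the dummy backward ignores grad_output
print(x2.grad)  # correct chain-rule gradient: 8 * x2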