The Custom C++ and CUDA Operators tutorial says “Do not return any mutated Tensors as outputs of the operator”. How can I add autograd/backward support for an operator when the mutated Tensors are not returned? Without them as outputs, I do not think the backward function will receive a corresponding `grad` input.
It seems that you need to call the `.clone()` method of the mutated inputs and return the clones. This lets you return them, so the backward function receives corresponding `grad` inputs, without violating the rule against returning mutated Tensors: you are returning clones rather than the mutated Tensors themselves.
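
A minimal sketch of the clone-and-return pattern, written with `torch.autograd.Function` rather than the tutorial's operator registration, so treat it as an illustration of the idea and not as the tutorial's method. The names `scale_into_` and `ScaleInto` are hypothetical, and the in-place kernel is simulated with `copy_`:

```python
import torch


def scale_into_(out, x, factor):
    # Hypothetical stand-in for a mutating kernel: writes x * factor into `out`.
    out.copy_(x * factor)


class ScaleInto(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, out, factor):
        ctx.factor = factor
        scale_into_(out, x, factor)  # mutates `out` in place
        return out.clone()           # return a copy, not the mutated Tensor

    @staticmethod
    def backward(ctx, grad_result):
        # One gradient per forward input: x, out (a plain buffer), factor.
        return grad_result * ctx.factor, None, None


x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
out = torch.zeros(3)  # buffer that the kernel overwrites
result = ScaleInto.apply(x, out, 2.0)
result.sum().backward()  # the gradient arrives through the cloned output
```

Here `result` holds the same values as `out` but lives in separate storage, so the mutated buffer itself is never returned, while `backward` still receives a gradient for the returned clone. The cost is one extra copy per call.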