The thing is, the decorator assumes that whenever someone asks for higher-order derivatives via double backward, at least one grad_output in the backward parameter list requires grad. However, that's often not the case when someone encapsulates fancy or complicated logic in the forward/backward implementation: the only parameters passed to the backward call are irrelevant to the higher-order derivative computation anyway. This makes the once_differentiable decorator pretty much useless. E.g. the example I gave in the main post shows that it doesn't stop a wrong Hessian from being computed. Another example is the post above, where I also believe the decorator does nothing, because the only thing exposed to the outside is grad_output, which never carries a graph (i.e. never requires grad) in higher-order derivative computations.
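To make this concrete, here is a minimal sketch of the failure mode. It is a toy of my own, not the example from the main post: the Square function, the input value 3.0, and the extra x ** 3 term are all made up for illustration. The assumption is that once_differentiable only raises when one of the grad_output arguments requires grad, and otherwise just returns the result computed under no_grad.

```python
import torch
from torch.autograd import Function
from torch.autograd.function import once_differentiable


class Square(Function):
    """Hypothetical custom op computing x * x with a hand-written backward."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x * x

    @staticmethod
    @once_differentiable
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # The dependence on x comes from the saved tensor, not from grad_output.
        return 2 * x * grad_output


x = torch.tensor(3.0, requires_grad=True)

# Mix the custom op with a native op so the first-order gradient still has a graph.
y = Square.apply(x) + x ** 3

# First derivative: analytically 2x + 3x^2.
(g,) = torch.autograd.grad(y, x, create_graph=True)

# Second derivative: analytically 2 + 6x, which is 20 at x = 3.
(h,) = torch.autograd.grad(g, x)

print("first derivative :", g.item())
print("second derivative:", h.item())
```

Here grad_output is just a tensor of ones that does not require grad, so the decorator never raises. The 2x term from Square is computed under no_grad and enters g as a constant, so the second derivative only sees the x ** 3 part and comes out as 6x instead of 2 + 6x, with no error or warning.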