I am interested in computing f’(x), where f is an ordinary PyTorch computation. I don’t really care about the value f(x) itself; I only need the gradient. I could write the gradient computation out by hand, but that’s pretty tedious. Is using PyTorch’s autograd framework efficient for this sort of thing? The computation of f’(x) happens inside a torch.no_grad() block, so I do not need the gradient of the gradient computation itself.

I have a few use cases, but one is that I need the gradient of |1 - Var(x)| with respect to x, where the variance is computed over the batch dimension.
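
To make the use case concrete, here is a minimal sketch of what I have in mind. It rests on some assumptions of mine: x has shape (batch, features), the per-feature terms are summed into a scalar before differentiating, and the variance is PyTorch's default unbiased estimator. The helper names `grad_abs_one_minus_var` and `grad_by_hand` are made up for illustration. Since the surrounding code runs under torch.no_grad(), the sketch re-enables graph recording locally with torch.enable_grad() and compares the autograd result with the hand-derived gradient.

```python
import torch


def grad_abs_one_minus_var(x: torch.Tensor) -> torch.Tensor:
    """Gradient of sum_j |1 - Var_batch(x)[:, j]| w.r.t. x, via autograd."""
    # Re-enable graph recording locally; the caller may be under torch.no_grad().
    with torch.enable_grad():
        # Detached leaf copy, so no graph leaks back into the caller's tensors.
        x_req = x.detach().requires_grad_(True)
        # Variance over the batch dimension (dim=0), unbiased by default.
        # Summing over features is an assumption made to obtain a scalar.
        loss = (1.0 - x_req.var(dim=0)).abs().sum()
        (grad,) = torch.autograd.grad(loss, x_req)
    return grad


def grad_by_hand(x: torch.Tensor) -> torch.Tensor:
    """Same gradient written out manually, for comparison."""
    n = x.shape[0]
    centered = x - x.mean(dim=0, keepdim=True)
    var = centered.pow(2).sum(dim=0) / (n - 1)
    # d|1 - v|/dv = -sign(1 - v), and dv_j/dx_ij = 2 * (x_ij - mean_j) / (n - 1).
    return -torch.sign(1.0 - var) * 2.0 * centered / (n - 1)


torch.manual_seed(0)
x = torch.randn(8, 3, dtype=torch.float64)

with torch.no_grad():
    g_auto = grad_abs_one_minus_var(x)
    g_hand = grad_by_hand(x)
```

In this sketch the forward value of the loss is still computed, because autograd needs the forward pass to build the graph; only the backward result is kept.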