I’m trying to implement a custom autograd function whose forward output is the solution of an optimization problem. I’d like to solve this optimization problem during the forward pass, using a new subgraph built inside the autograd function.

However, it seems impossible to create a subgraph inside forward(): requires_grad is automatically disabled for every tensor computed there, and no warning is raised.
Do you know how I could fix this issue?

I think the answer depends on your exact problem. Could you give more details on your optimization procedure?

Why does your optimization procedure need to be in the forward() staticmethod of some Function anyway?
You could also precompute the Jacobian of your operation in the forward pass and use a custom Function to create an output tensor (result of your optimization) which would apply that Jacobian in its backward() staticmethod.
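To make the Jacobian suggestion concrete, here is a minimal sketch. The function solve() is a hypothetical stand-in for whatever differentiable map produces your result, and the class name is made up for illustration. The Jacobian is computed once in forward(), saved on ctx, and applied to the incoming gradient in backward().

```python
import torch
from torch.autograd.functional import jacobian


def solve(x):
    # Hypothetical stand-in for the map from input to optimization result.
    return torch.sin(x) * x.sum()


class WithPrecomputedJacobian(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        x_const = x.detach()
        # Grad mode is off inside forward(), so re-enable it explicitly.
        with torch.enable_grad():
            J = jacobian(solve, x_const)  # shape (n_out, n_in)
        ctx.save_for_backward(J)
        return solve(x_const)

    @staticmethod
    def backward(ctx, grad_output):
        (J,) = ctx.saved_tensors
        # Vector-Jacobian product: (n_out,) @ (n_out, n_in) -> (n_in,)
        return grad_output @ J
```

This only pays off when the Jacobian is small enough to store. For large outputs it is usually cheaper to compute the vector-Jacobian product directly in backward().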

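You would need torch.autograd.enable_grad within your forward() staticmethod, because gradient tracking is disabled there by default. Use it as a context manager around the code that builds the inner graph.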
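A minimal sketch of what that could look like, assuming a toy inner problem (not the one from the question): minimize 0.5*||y - x||^2 + 0.5*||y||^2 over y, whose solution is y = x / 2. The class name, step size, and iteration count are all made up for illustration. The inner gradient descent builds its own small graph under torch.enable_grad(), while backward() returns a gradient derived by hand.

```python
import torch


class InnerArgmin(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        x_const = x.detach()
        # Without enable_grad, nothing computed here would track gradients.
        with torch.enable_grad():
            y = torch.zeros_like(x_const, requires_grad=True)
            for _ in range(50):
                loss = 0.5 * ((y - x_const) ** 2).sum() + 0.5 * (y ** 2).sum()
                (g,) = torch.autograd.grad(loss, y)
                # Plain gradient step; restart from a fresh leaf each time.
                y = (y - 0.25 * g).detach().requires_grad_(True)
        return y.detach()

    @staticmethod
    def backward(ctx, grad_output):
        # For this toy problem y*(x) = x / 2, so dy*/dx = 0.5 (derived by hand).
        return 0.5 * grad_output
```

The inner graph is discarded after each step, so the outer graph only sees the single node created by InnerArgmin.apply.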

(Of course, my nested problem is less trivial so I can’t find gradients by hand)

I am not sure I fully understand this solution, or whether it could work in my case.
I am actually doing bi-level optimization. The gradient returned by the backward method is not obtained from the graph; it is computed from the KKT optimality conditions.
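For illustration only, here is the simplest case of that idea, assuming an unconstrained quadratic inner problem (a hypothetical example, not my actual problem): y*(x) = argmin_y 0.5 * y^T A y - x^T y. The optimality condition reduces to A y = x, and differentiating it gives grad_x = A^{-T} grad_output. No graph of the inner solver is needed.

```python
import torch


class QuadraticArgmin(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, A):
        # Inner solution from the stationarity condition A y = x.
        y = torch.linalg.solve(A, x)
        ctx.save_for_backward(A)
        return y

    @staticmethod
    def backward(ctx, grad_output):
        (A,) = ctx.saved_tensors
        # Implicit differentiation of A y = x: dy/dx = A^{-1},
        # so the vector-Jacobian product is A^{-T} grad_output.
        grad_x = torch.linalg.solve(A.T, grad_output)
        # A is treated as a constant in this sketch, hence None.
        return grad_x, None
```

With inequality constraints the same pattern applies, but the linear system in backward() comes from differentiating the full KKT system rather than a single stationarity equation.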