create_graph=True in backward() causes memory leakage

Hi, I'm running into the same problem, but I want to backward through the first derivative w.r.t. the network parameters. Since both torch.autograd.grad and torch.autograd.functional.jacobian seem to only take vector/tensor inputs, while the network parameters are a tuple of tensors, is there a feasible way to do this? Thanks in advance for any help!

This post: Get gradient and Jacobian wrt the parameters helps with computing the Jacobian, but I'm trying to backpropagate further through that Jacobian. It would be really nice if PyTorch supported gradients w.r.t. PyTree objects the way JAX does.
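For concreteness, here is a minimal sketch of the kind of thing I'm after (the nn.Linear model and the squared-norm penalty are just placeholders for illustration): build the Jacobian w.r.t. the parameters row by row with torch.autograd.grad and create_graph=True, then differentiate a scalar of that Jacobian w.r.t. the same parameters again. It seems to work this way, but looping over outputs and manually carrying the tuple of parameter tensors feels clumsy compared to differentiating w.r.t. a parameter PyTree directly:

```python
import torch
import torch.nn as nn

# Toy setup (placeholder names): a small model and a batch of inputs.
model = nn.Linear(3, 2)
params = tuple(model.parameters())
x = torch.randn(5, 3)

out = model(x).sum(dim=0)  # one scalar per output unit

# Jacobian of the outputs w.r.t. each parameter tensor, built row by row.
# create_graph=True keeps the graph so the Jacobian itself stays differentiable.
jac_rows = [torch.autograd.grad(o, params, create_graph=True) for o in out]

# Some scalar built from the Jacobian (squared Frobenius norm here),
# then a second backward pass w.r.t. the same parameter tuple.
penalty = sum(g.pow(2).sum() for row in jac_rows for g in row)
second_grads = torch.autograd.grad(penalty, params)
```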