create_graph=True in backward() causes memory leakage

Huimin_ZENG · January 15, 2020, 10:30pm

Hi! I was trying to call backward on the first derivative (Jacobian). I observed that memory usage keeps growing when I call y.backward(retain_graph=True, create_graph=True).

I have read this post https://discuss.pytorch.org/t/how-to-free-the-graph-after-create-graph-true/58476/4, where it is said that the graph will be deleted once the reference to it is deleted.

But I also found this post: https://github.com/pytorch/pytorch/issues/4661, stating that the leakage issue is still open.

I am confused. Could you please help me out? And I can’t use torch.autograd.grad, since my outputs y are vectors, not scalar outputs.

Thanks!

albanD · January 15, 2020, 11:09pm

Hi,

The conclusion from the issue you linked is that this is mostly expected behavior (or something we should forbid people from doing).
torch.autograd.grad works for vectors as well. What is the issue you encounter when trying to use it?
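
For readers landing here, a minimal sketch of what using torch.autograd.grad with a vector output can look like. The tensors x, y and v are made up for illustration and are not from the original question; the only assumption is that a vector-Jacobian product (via grad_outputs) is what is wanted rather than the full Jacobian.

```python
import torch

# Hypothetical toy input; any tensor with requires_grad=True works the same way.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2  # vector output, shape (3,)

# For a non-scalar y, grad_outputs supplies the vector v of the
# vector-Jacobian product v^T J. Using ones is equivalent to y.sum().
v = torch.ones_like(y)

# create_graph=True keeps the graph of the derivative itself,
# so the result can be differentiated again.
(first,) = torch.autograd.grad(y, x, grad_outputs=v, create_graph=True)

# Differentiate the first derivative once more. No create_graph here,
# so this second graph is not kept around afterwards.
(second,) = torch.autograd.grad(first.sum(), x)

print(first)   # dy/dx, elementwise 2 * x
print(second)  # d/dx of sum(2 * x), elementwise 2
```

Unlike y.backward(create_graph=True), this returns the gradients directly and does not write them into x.grad.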

Huimin_ZENG · January 15, 2020, 11:22pm

Ah! I see! So I should stick to torch.autograd.grad, right?

I just figured out how to use torch.autograd.grad for vectors a couple of minutes ago.

Thanks!

albanD · January 16, 2020, 2:51pm

Yes you should stick to torch.autograd.grad and all will be good.

ybj14 · November 18, 2020, 6:59pm

Hi, I have run into the same problem, but I want to call backward on the first derivative w.r.t. the network parameters. Since both torch.autograd.grad and torch.autograd.functional.jacobian only take vector inputs, while network parameters are a tuple of tensors, is there a feasible way to do this? Thanks in advance for any possible help!

The post "Get gradient and Jacobian wrt the parameters" helps with getting the Jacobian, but I'm trying to backpropagate further through the Jacobian. It would be really nice if PyTorch supported gradients w.r.t. PyTree objects like JAX does.

albanD · November 18, 2020, 7:15pm

Hi,

Both these functions take either a single Tensor or a tuple of Tensors as input. So it should work just fine.
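
A rough sketch of how that can look in practice. The small model, the input inp and the gradient penalty are hypothetical and only serve as illustration; the tanh is there so that the first derivative actually depends on the parameters and can be differentiated again.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy network and input, for illustration only.
model = torch.nn.Linear(3, 2)
params = tuple(model.parameters())  # (weight, bias)
inp = torch.randn(4, 3)

out = torch.tanh(model(inp))  # non-scalar output, shape (4, 2)

# torch.autograd.grad accepts a tuple of tensors as inputs and returns
# one gradient per parameter. create_graph=True makes them differentiable.
grads = torch.autograd.grad(
    out, params, grad_outputs=torch.ones_like(out), create_graph=True
)

# Differentiate through the first derivative, here via a made-up penalty.
penalty = sum(g.pow(2).sum() for g in grads)
second_grads = torch.autograd.grad(penalty, params)

# torch.autograd.functional.jacobian also takes a tuple of inputs and
# returns one Jacobian per input for a single-tensor output.
def f(w, b):
    return torch.tanh(inp @ w.t() + b)

jac_w, jac_b = torch.autograd.functional.jacobian(
    f, (model.weight.detach(), model.bias.detach())
)
print(jac_w.shape, jac_b.shape)  # output shape followed by each input's shape
```

Each returned gradient has the same shape as the parameter it corresponds to, so the tuple can be zipped with params directly.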