So retain_graph is used for calling backward a second time on the same graph?

and create_graph is used for computing higher-order derivatives of the graph's parameters?
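If I understand the first flag right, it can be sketched like this (a toy example of my own, not from the docs):

```python
import torch

x = torch.ones(2, requires_grad=True)
y = (x * x).sum()

# retain_graph=True keeps the forward graph's buffers alive after
# backward(), so a second backward pass over the same graph is allowed:
y.backward(retain_graph=True)
y.backward()  # without retain_graph above, this raises a RuntimeError
```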


I’m also interested to know!

It seems that calling torch.autograd.grad with **both** flags set to `True` uses (much) more memory than only setting `retain_graph=True`. In the master docs, `retain_graph` mentions the "*graph used to compute the grad*", while `create_graph` mentions that the "*graph of the derivative will be constructed, allowing to compute higher order derivative products*".

I found the choice of names for those flags a bit confusing, since it seemed logical at first sight that in order to *retain* the graph it also needs to be *created*. On second thought, and rereading the wording in the docs, it seems that those flags concern two different graphs, as you also asked/suggested @onlytailei.
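To check the "two graphs" reading, here is a small second-order example (my own toy function, not from the docs): with `create_graph=True` the returned gradient is itself part of a graph, so it can be differentiated again.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = x ** 3  # dy/dx = 3*x**2, d2y/dx2 = 6*x

# create_graph=True additionally builds the graph *of the derivative*,
# so grad_x is itself differentiable:
grad_x, = torch.autograd.grad(y, x, create_graph=True)
grad2_x, = torch.autograd.grad(grad_x, x)
print(grad_x.item(), grad2_x.item())  # 27.0 18.0
```

That extra graph of the derivative would also explain the higher memory use I saw with both flags set.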

I think this is the confirmation: