What's the difference between retain_graph and create_graph?

So retain_graph is used for calling backward a second time?
And create_graph is used for computing higher-order derivatives of the graph parameters?


I’m also interested to know!

It seems that calling torch.autograd.grad with BOTH flags set to True uses (much) more memory than setting only retain_graph=True. In the master docs, retain_graph mentions the “graph used to compute the grad”, and create_graph mentions that the “graph of the derivative will be constructed, allowing to compute higher order derivative products”.
I found the names of those flags a bit confusing, since at first sight it seemed logical that in order to retain the graph it also needs to be created. On second thought, and after rereading the wording in the docs, it seems those flags concern two different graphs, as you also asked/suggested @onlytailei.
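A minimal sketch of the distinction as I understand it (the toy function and values are mine, not from the docs): retain_graph keeps the original graph alive so backward can run on it again, while create_graph additionally builds a graph for the gradient itself, so that gradient can be differentiated once more. Since create_graph records extra operations, the higher memory use makes sense.

```python
import torch

# retain_graph=True: keep the ORIGINAL graph so a second backward works.
x = torch.tensor(3.0, requires_grad=True)
y = x ** 3
y.backward(retain_graph=True)  # first backward; graph is kept alive
y.backward()                   # second backward; without retain_graph this raises
# Gradients accumulate: d(x**3)/dx = 3*x**2 = 27, summed twice -> 54.
first_grad = x.grad

# create_graph=True: also build a graph FOR THE GRADIENT, enabling a
# higher-order derivative (it implies retain_graph by default, which is
# presumably part of the extra memory cost).
x2 = torch.tensor(3.0, requires_grad=True)
y2 = x2 ** 3
(dy_dx,) = torch.autograd.grad(y2, x2, create_graph=True)  # 3*x**2 = 27
(d2y_dx2,) = torch.autograd.grad(dy_dx, x2)                # 6*x   = 18
print(first_grad.item(), dy_dx.item(), d2y_dx2.item())
```

Dropping retain_graph=True from the first backward call raises a RuntimeError on the second one, which matches the “graph used to compute the grad” wording.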

I think this is the confirmation: