Why should parameters for modules be 'leaf tensors'?

I encountered this error message in the above line while developing a MAML-like architecture, which requires calculating higher-order derivatives of the accumulated gradients of the parameters.

I wanted to set new parameters that carry a grad_fn, but as shown in the above line of code, PyTorch requires that the parameters of modules be leaf tensors.

Is there any reason for this? Or can I safely ignore this error message (e.g. delete the line)?

As far as I can tell, higher-order autograd works for every Function in PyTorch (e.g. convNd, linear, …), so I would guess it shouldn't be a problem to replace network parameters with non-leaf ones.
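For reference, here is a minimal toy sketch (arbitrary shapes, illustrative only) of what I mean by higher-order autograd working through the built-in Functions:

```python
import torch
import torch.nn.functional as F

x = torch.randn(2, 3)
w = torch.randn(4, 3, requires_grad=True)

# A scalar loss built from a built-in Function (linear).
y = F.linear(x, w).pow(2).sum()

# First-order gradient; create_graph=True keeps the graph so the
# gradient itself is differentiable (it has a grad_fn).
(g,) = torch.autograd.grad(y, w, create_graph=True)

# Second-order gradient: differentiate the first gradient again.
(h,) = torch.autograd.grad(g.sum(), w)

print(g.grad_fn is not None, h.shape)
```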

You can’t do that in PyTorch currently. It’s more of an administrative problem than a fundamental one (albeit not an entirely trivial one; I once implemented a proof of concept for this).
For now, the functional interfaces, and computing things in forward, are what you have to get along with.
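For example, here is a minimal sketch of the functional approach for a MAML-style inner step, assuming a toy one-layer model and a made-up inner learning rate of 0.1 — the "fast weights" stay non-leaf tensors and are simply passed to F.linear instead of being assigned back to a module:

```python
import torch
import torch.nn.functional as F

# Leaf meta-parameters (these play the role of module parameters).
w = torch.randn(1, 3, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

x = torch.randn(4, 3)
y = torch.randn(4, 1)

# Inner-loop loss on the current parameters.
inner_loss = F.mse_loss(F.linear(x, w, b), y)

# create_graph=True so the outer loss can differentiate through
# this gradient step (the higher-order part of MAML).
gw, gb = torch.autograd.grad(inner_loss, (w, b), create_graph=True)
fast_w = w - 0.1 * gw  # non-leaf "parameter" with a grad_fn
fast_b = b - 0.1 * gb

# Outer loss uses the updated weights via the functional interface.
outer_loss = F.mse_loss(F.linear(x, fast_w, fast_b), y)
outer_loss.backward()  # gradients flow back to the leaf meta-parameters

print(fast_w.is_leaf, w.grad is not None)
```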

Best regards

