I was reading the well-known MAML paper (Model-Agnostic Meta-Learning), which says it uses Hessian-vector products. How are they used:
- in the mathematics, i.e. how do the Hessian-vector products enter the update rule?
- in the code, i.e. how are the parameters being learned actually updated?
Or is there a nice example of how Hessian Vector products are used in (meta) training?
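
For concreteness, here is a minimal sketch of where such a product can arise, assuming a single inner gradient step θ' = θ − α∇L_train(θ). By the chain rule, the gradient of the validation loss with respect to the initial θ is g − αHg, where g = ∇L_val(θ') and H is the Hessian of L_train at θ, so only the product Hg is needed and never the full Hessian. The toy quadratic losses, the constants, and every function name below are hypothetical and not taken from the MAML paper or its code. The product is approximated here with a finite difference of gradients, whereas autodiff frameworks usually obtain it by differentiating through the gradient computation.

```python
# Toy sketch (all names and constants are hypothetical): one MAML-style inner
# step on a 2-parameter quadratic, with the meta-gradient assembled from a
# Hessian-vector product instead of an explicit Hessian.

ALPHA = 0.1                    # inner-loop learning rate (assumed value)
A = [[3.0, 1.0], [1.0, 2.0]]   # Hessian of the toy training loss
B = [1.0, -1.0]                # linear term of the toy training loss
C = [0.5, 0.5]                 # target of the toy validation loss


def grad_train(theta):
    # Gradient of L_train(theta) = 0.5 * theta^T A theta - B^T theta.
    return [sum(A[i][j] * theta[j] for j in range(2)) - B[i] for i in range(2)]


def loss_val(theta):
    # L_val(theta) = 0.5 * ||theta - C||^2.
    return 0.5 * sum((t - c) ** 2 for t, c in zip(theta, C))


def grad_val(theta):
    return [t - c for t, c in zip(theta, C)]


def inner_step(theta):
    # theta' = theta - ALPHA * grad L_train(theta)
    g = grad_train(theta)
    return [t - ALPHA * gi for t, gi in zip(theta, g)]


def hvp_train(theta, v, eps=1e-4):
    # Hessian-vector product H v via a central difference of gradients:
    # H v ~ (grad(theta + eps*v) - grad(theta - eps*v)) / (2*eps).
    plus = grad_train([t + eps * vi for t, vi in zip(theta, v)])
    minus = grad_train([t - eps * vi for t, vi in zip(theta, v)])
    return [(p - m) / (2 * eps) for p, m in zip(plus, minus)]


def meta_grad(theta):
    # d L_val(theta') / d theta = g - ALPHA * H g, with g = grad L_val(theta').
    g = grad_val(inner_step(theta))
    hg = hvp_train(theta, g)
    return [gi - ALPHA * hi for gi, hi in zip(g, hg)]


def meta_grad_numeric(theta, eps=1e-5):
    # Brute-force check: finite differences through the whole inner step.
    out = []
    for i in range(len(theta)):
        up, dn = list(theta), list(theta)
        up[i] += eps
        dn[i] -= eps
        out.append((loss_val(inner_step(up)) - loss_val(inner_step(dn))) / (2 * eps))
    return out
```

Calling `meta_grad([1.0, 2.0])` and `meta_grad_numeric([1.0, 2.0])` should agree up to rounding, which suggests the HVP term is exactly the part of the outer gradient that comes from differentiating through the inner update.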