Thanks in advance for your help and suggestions!
I’m wondering if there’s a way to compute an HVP (Hessian-vector product) when we have multiple vectors to be right-multiplied by the Hessian, using a single call to `torch.autograd.grad`.
Concretely, suppose we are given a gradient vector `grad` w.r.t. a parameter `x`. We can compute the HVP with `torch.autograd.grad(grad, x, grad_outputs=v)`, which yields `Hv`, where `H` is the Hessian w.r.t. `x`.
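For concreteness, here is a minimal self-contained check of the single-vector case on a toy quadratic of my own choosing (for `f(x) = 0.5 * x @ A @ x` with symmetric `A`, the Hessian is exactly `A`):

```python
import torch

torch.manual_seed(0)
A = torch.randn(4, 4)
A = (A + A.T) / 2                       # symmetric, so the Hessian of f is A
x = torch.randn(4, requires_grad=True)
v = torch.randn(4)

loss = 0.5 * x @ A @ x
# create_graph=True keeps the graph so we can differentiate grad again
grad, = torch.autograd.grad(loss, x, create_graph=True)
hvp, = torch.autograd.grad(grad, x, grad_outputs=v)  # H @ v
assert torch.allclose(hvp, A @ v, atol=1e-5)
```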
Suppose that, instead of a single `v`, we are given a sequence of tensors `(v1, v2, ..., vm)` such that each `vi.size() == grad.size()`. Is there an efficient way to compute `(Hv1, Hv2, ..., Hvm)` in a single call to `torch.autograd.grad`, analogous to the computation of `Hv` above?
A naive way would be to feed each `vi` into a separate call to `torch.autograd.grad`, but I’m curious to hear if there’s a more efficient implementation.
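For reference, one approach I’ve come across (not sure it’s optimal) is the `is_grads_batched=True` flag of `torch.autograd.grad`, available in newer PyTorch versions (1.11+): stack the vectors along a leading dimension and the flag treats that dimension as a batch (via `vmap` under the hood), producing all `m` HVPs in one call. A sketch on the same toy quadratic:

```python
import torch

torch.manual_seed(0)
A = torch.randn(4, 4)
A = (A + A.T) / 2                       # symmetric, so the Hessian of f is A
x = torch.randn(4, requires_grad=True)
vs = torch.randn(3, 4)                  # m = 3 vectors stacked along dim 0

loss = 0.5 * x @ A @ x
grad, = torch.autograd.grad(loss, x, create_graph=True)
# is_grads_batched=True interprets the leading dim of grad_outputs as a
# batch of vectors, returning the m HVPs stacked along that same dim
hvps, = torch.autograd.grad(grad, x, grad_outputs=vs, is_grads_batched=True)
assert torch.allclose(hvps, vs @ A, atol=1e-5)  # A symmetric: vs @ A == (A @ vi) rows
```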