In the PyTorch documentation, the results of vhp and hvp are the same.
But according to the notes:
If your function is twice continuously differentiable, then hvp = vhp.t()
Usually, they should be the same. Could anyone tell me the reason, or point out what I have misunderstood? Thanks a lot!
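
For reference, here is a minimal sketch of the comparison I have in mind. The function `f` and the values of `x` and `v` are made up for illustration; only `torch.autograd.functional.hvp` and `torch.autograd.functional.vhp` come from PyTorch. My assumption is that with a 1-D `v` the transpose in the note changes nothing, since a 1-D tensor has no row or column orientation.

```python
import torch
from torch.autograd.functional import hvp, vhp


def f(x):
    # Hypothetical smooth scalar function; the x[0] * x[1]**2 term
    # gives the Hessian off-diagonal entries.
    return (x ** 3).sum() + x[0] * x[1] ** 2


x = torch.tensor([1.0, 2.0, 3.0])   # point at which the Hessian is evaluated
v = torch.tensor([1.0, 0.0, -1.0])  # 1-D vector, same shape as x

# Both functions return a tuple (f(x), product).
_, hv = hvp(f, x, v)  # Hessian-vector product, H @ v
_, vh = vhp(f, x, v)  # vector-Hessian product, v @ H

# For a twice continuously differentiable f the Hessian is symmetric,
# so both products should hold the same numbers.
same = torch.allclose(hv, vh)
print(hv, vh, same)
```

If the Hessian were not symmetric, H @ v and v @ H would generally differ, which seems to be the case the note is warning about.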