Hi, I’m trying to compute a Hessian-vector product (HVP) for a network.
I tried to follow the HVP implementation in TensorFlow, but the following code doesn’t work as expected. Does anybody know how to fix it?
Thank you in advance.
import torch
from torch import autograd

input = torch.randn(1, 3, 32, 32)  # batch size is 1
out = net(input).sum()  # net is a neural network
para_list = [x for x in net.parameters()]
grads = autograd.grad([out], para_list, retain_graph=True, create_graph=True)
# list_of_v is a list of vectors v, one per parameter in para_list
elem_prod = [g * v for g, v in zip(grads, list_of_v)]
hvps = autograd.grad(elem_prod, para_list, create_graph=True)  # this call raises the error
The error says:
RuntimeError: grad can be implicitly created only for scalar outputs.
If my implementation is totally wrong, what is the correct way of doing this?
Thanks again.
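
For reference, here is a minimal sketch of one way this can be made to work. The error comes from the last call: autograd.grad only creates the starting gradient implicitly when the output is a scalar, and elem_prod is a list of non-scalar tensors. Two common fixes are to reduce the products to a single scalar before differentiating, or to pass the vectors through grad_outputs. The small network, the input, and the random vectors below are hypothetical stand-ins, not the original net or list_of_v.

```python
import torch
from torch import autograd, nn

torch.manual_seed(0)

# Hypothetical stand-in for `net`; the final Tanh makes every parameter
# appear in the second-order graph.
net = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 8),
    nn.Tanh(),
    nn.Linear(8, 4),
    nn.Tanh(),
)

inp = torch.randn(1, 3, 32, 32)  # batch size is 1
out = net(inp).sum()
para_list = list(net.parameters())

# One vector per parameter, with the same shape (random here, as an assumption).
list_of_v = [torch.randn_like(p) for p in para_list]

# First-order gradients; create_graph=True keeps them differentiable.
grads = autograd.grad(out, para_list, create_graph=True)

# Option 1: reduce to the scalar sum_i <g_i, v_i>, then differentiate it.
grad_dot_v = sum((g * v).sum() for g, v in zip(grads, list_of_v))
hvps = autograd.grad(grad_dot_v, para_list, retain_graph=True)

# Option 2: keep the gradients non-scalar and supply v as grad_outputs.
hvps_alt = autograd.grad(grads, para_list, grad_outputs=list_of_v)
```

Both options should give the same result, with one tensor per parameter. One caveat: if some parameter has no second-order dependence (for example, the bias of a final linear layer when the output is a plain sum), the second autograd.grad call will likely complain that a tensor was not used in the graph, and passing allow_unused=True, then treating the returned None entries as zeros, is probably needed.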