I am required to compute my loss function as below:
where “Theta” denotes the parameters of the network (i.e. the weights), “f(Theta)” is the network output, “y” is the true label, and “x” is the input sample.
Would you please give me a tip on how to do that, ideally with sample code?
How can I compute this in PyTorch during the training of my network?
You can find in this gist a function that shows how to compute the Hessian.
But you can compute the dot product with f - y much more efficiently using the Rop function from this gist. In particular, the call would be
Rop(loss, theta, f(theta) - y).
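If you don't want to depend on the gist, the same Hessian-vector product can be sketched with plain double backprop. This is a minimal sketch, assuming the goal is H·v = d/dtheta (grad(loss)·v); the function name hvp and the toy quadratic loss are just for illustration:

```python
import torch

def hvp(loss, params, vec):
    # First backward pass: gradient of the loss w.r.t. the parameters,
    # with create_graph=True so we can differentiate through it again.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    # Scalar dot product grad(loss) . v.
    dot = sum((g * v).sum() for g, v in zip(grads, vec))
    # Second backward pass: the gradient of that scalar is H v,
    # without ever materializing the full Hessian.
    return torch.autograd.grad(dot, params)

# Toy usage with a known Hessian: loss = sum(theta^2), so H = 2 * I.
theta = torch.randn(3, requires_grad=True)
loss = (theta ** 2).sum()
v = torch.ones(3)
(hv,) = hvp(loss, [theta], [v])  # equals 2 * v for this loss
```

In your case, vec would play the role of f(theta) - y (detached from the graph).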
Regarding the first gist, do you mean to use the following function,
where “y” is “loss” and “x” is “theta” in my example? Is that right?
import torch

def jacobian(y, x, create_graph=False):
    jac = []
    flat_y = y.reshape(-1)
    grad_y = torch.zeros_like(flat_y)
    for i in range(len(flat_y)):
        # One-hot grad_output selects the gradient of the i-th output component.
        grad_y[i] = 1.
        grad_x, = torch.autograd.grad(flat_y, x, grad_y, retain_graph=True, create_graph=create_graph)
        jac.append(grad_x.reshape(x.shape))
        grad_y[i] = 0.
    return torch.stack(jac).reshape(y.shape + x.shape)
Yes, you want the gradient of the loss with respect to theta.