theta' = theta - d Loss(theta, target) / d theta

This is the standard gradient-descent update (learning rate omitted for brevity).
Instead, I want to update theta as

theta = theta - d Loss(theta', target) / d theta

Since theta' itself depends on theta, I think this requires second-order gradients of the loss computed from theta' and target.
How can I implement this?
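Here is a minimal sketch of what I have in mind, using a toy scalar loss (the names `loss_fn`, `lr`, and the tensors are my own, not from any specific model). The idea is to build theta' with `create_graph=True` so the inner update stays in the autograd graph, and then backpropagate Loss(theta', target) all the way to theta:

```python
import torch

# Toy setup (assumed for illustration)
theta = torch.tensor([1.0, 2.0], requires_grad=True)
target = torch.tensor([0.0, 0.0])
lr = 0.1

def loss_fn(params, target):
    return ((params - target) ** 2).sum()

# Inner step: create_graph=True keeps d(inner_loss)/d(theta) itself
# differentiable, so second-order terms survive
inner_loss = loss_fn(theta, target)
grad_theta, = torch.autograd.grad(inner_loss, theta, create_graph=True)
theta_prime = theta - lr * grad_theta   # theta' is now a function of theta

# Outer loss at theta'; backward reaches theta through theta_prime
outer_loss = loss_fn(theta_prime, target)
outer_loss.backward()

# theta.grad now holds d Loss(theta', target) / d theta,
# including the term from d theta' / d theta
print(theta.grad)
```

For this quadratic loss the result can be checked by hand: theta' = (1 - 2*lr) * theta, so d Loss(theta')/d theta = 2 * theta' * (1 - 2*lr).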
If I call loss.backward() after the first update of theta, I would get d Loss(theta', target) / d theta'. If I then swap my model's parameters back from theta' to theta, will backward() compute d Loss(theta', target) / d theta?
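To probe this, I tried a minimal experiment (my own toy tensors, no real model): backward() differentiates with respect to the tensors that actually produced the loss in the recorded graph, so what matters is how theta' was constructed from theta, not which values the parameters hold afterwards.

```python
import torch

theta = torch.tensor([3.0], requires_grad=True)
theta_prime = theta - 0.5          # "first update"; still linked to theta in the graph
loss = (theta_prime ** 2).sum()    # Loss(theta')
loss.backward()                    # chain rule flows back to the leaf tensor theta
print(theta.grad)                  # d(theta'^2)/d theta = 2 * theta' = 5.0
```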