I am trying to implement the model in this article, where the loss function contains the gradient w.r.t. the input data *x*. Basically, we want the **gradient** of the NN to approximate a certain function.

In HIPS/autograd, it would be something like

- Define the forward pass, with input data `x` and weights `w`:

  `f(x; w) = ...`

- Take the gradient w.r.t. `x`:

  `dfdx = grad(f, x)`

- Use that gradient to construct the loss function. We want `dfdx` to be close enough to our desired `dfdx_true`, so:

  `loss = sum_of_error(dfdx, dfdx_true)`

- Take the gradient of the loss w.r.t. `w`, just like in any neural network:

  `lossgradient = grad(loss, w)`

- Use any normal training method, for example:

  `w - lr*lossgradient(w, x_data)`

In PyTorch, before I call `loss.backward()`, how can I make `loss` contain the gradient w.r.t. `x`? Notice that the final result is a mixed derivative with respect to `x` and `w`, not a second-order derivative with respect to `w`.