Loss function contains gradient w.r.t. input variables

I am trying to implement the model in this article, where the loss function contains a gradient w.r.t. the input data x. Basically, we want the gradient of the NN's output with respect to its input to approximate a certain function.

In HIPS/autograd, it would be something like this (a rough code sketch follows the list):

  1. Define the forward pass, with input data x and weights w:
    f(x; w) = ...
  2. Take the gradient w.r.t. x:
    dfdx = grad(f, x)
  3. Use this gradient to construct the loss function. We want dfdx to be close enough to our desired dfdx_true, so:
    loss = sum_of_error(dfdx, dfdx_true)
  4. Take the gradient of the loss w.r.t. w, just like in any neural network:
    lossgradient = grad(loss, w)
  5. Use any normal training method, for example:
    w = w - lr*lossgradient(w, x_data)
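For reference, here is a rough, self-contained sketch of those five steps in HIPS/autograd. The two-layer tanh network, the shapes and the learning rate are made up just to illustrate the idea:

    import autograd.numpy as np
    from autograd import grad

    def f(x, w):
        # 1. toy forward pass with scalar output; w = (W1, b1, W2, b2)
        W1, b1, W2, b2 = w
        h = np.tanh(np.dot(W1, x) + b1)
        return np.dot(W2, h) + b2

    # 2. gradient of f w.r.t. its first argument, the input x
    dfdx = grad(f, 0)

    def loss(w, x, dfdx_true):
        # 3. squared error between the model's input-gradient and the target
        return np.sum((dfdx(x, w) - dfdx_true) ** 2)

    # 4. gradient of the loss w.r.t. the weights w
    lossgradient = grad(loss, 0)

    def sgd_step(w, x, dfdx_true, lr=1e-2):
        # 5. one plain gradient-descent update on every weight
        g = lossgradient(w, x, dfdx_true)
        return tuple(wi - lr * gi for wi, gi in zip(w, g))

    # made-up shapes and data, only to make the sketch runnable
    w0 = (0.1 * np.ones((16, 3)), np.zeros(16), 0.1 * np.ones(16), 0.0)
    x0 = np.linspace(-1.0, 1.0, 3)
    w1 = sgd_step(w0, x0, np.zeros(3))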

In PyTorch, before I call loss.backward(), how can I make loss contain the gradient w.r.t. x? Notice that the final result is a mixed derivative with respect to x and w, not a second-order derivative with respect to w.

In PyTorch, the corresponding steps are (a code sketch follows the list):

  1. f(x; w) = ...
  2. dfdx = autograd.grad(f, x, df, create_graph=True)
  3. loss = sum_of_error(dfdx, dfdx_true)
  4. optimizer.zero_grad() then loss.backward()
  5. optimizer.step()
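Here is a minimal, self-contained sketch of those steps put together; the model, the shapes and the learning rate are only placeholders, and torch.ones_like(f) is passed as grad_outputs because f here has one value per sample rather than being a scalar:

    import torch
    from torch import autograd, nn

    model = nn.Sequential(nn.Linear(3, 16), nn.Tanh(), nn.Linear(16, 1))  # f(x; w)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

    x = torch.randn(32, 3, requires_grad=True)   # inputs must require grad
    dfdx_true = torch.randn(32, 3)               # desired df/dx

    f = model(x)                                  # 1. forward pass
    dfdx, = autograd.grad(f, x,                   # 2. gradient w.r.t. the input x
                          grad_outputs=torch.ones_like(f),
                          create_graph=True)      #    keep the graph so loss.backward() works
    loss = ((dfdx - dfdx_true) ** 2).sum()        # 3. error on the input-gradient
    optimizer.zero_grad()
    loss.backward()                               # 4. mixed derivative w.r.t. w is computed here
    optimizer.step()                              # 5. weight update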

What is df in this context? Thanks


Is there a way of saving dfdx as a function and just evaluating it for given inputs x?

I think that if f(x) is a linear function of x, it would be possible. However, if there are multiple hidden layers with non-linearities involved, backpropagation needs the intermediate feature values to perform the backward pass, so a forward pass is needed. Don't you think?
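If it helps, one common way to get a callable dfdx in PyTorch is to wrap the forward pass and the autograd.grad call into a single function, so evaluating it at new inputs still runs a fresh forward pass, exactly as described above. A sketch with a made-up toy model:

    import torch
    from torch import autograd, nn

    model = nn.Sequential(nn.Linear(3, 16), nn.Tanh(), nn.Linear(16, 1))

    def dfdx_fn(x):
        # returns df/dx at x; a forward pass is run on every call
        x = x.detach().requires_grad_(True)
        f = model(x)
        grad_x, = autograd.grad(f, x, grad_outputs=torch.ones_like(f))
        return grad_x

    # evaluate the input-gradient at new points
    x_new = torch.randn(5, 3)
    print(dfdx_fn(x_new))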