I am trying to implement the model from this article, where the loss function contains a gradient w.r.t. the input data `x`. Basically, we want the gradient of the NN to approximate a certain function.
In HIPS/autograd, it would be something like this (a consolidated sketch follows the list):

- Define the forward pass, with input data `x` and weights `w`: `f(x; w) = ...`
- Take the gradient w.r.t. `x`: `dfdx = grad(f, x)`
- Use this gradient to construct the loss function; we want `dfdx` to be close to our desired `dfdx_true`, so: `loss = sum_of_error(dfdx, dfdx_true)`
- Take the gradient of the loss w.r.t. `w`, just as in any neural network: `lossgradient = grad(loss, w)`
- Use any standard training method, for example: `w - lr * lossgradient(w, x_data)`
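Putting those steps together, here is a minimal runnable sketch of the autograd version. The tiny one-hidden-layer network, the flat weight vector `w`, and the squared-error `loss` are all my own stand-ins, just to make the steps concrete (note that autograd's `grad` actually takes an argument index, not the variable itself):

```python
import autograd.numpy as np
from autograd import grad

def f(x, w):
    # One hidden layer; the shapes and names here are made up for illustration.
    W1 = np.reshape(w[:10], (1, 10))
    b1 = w[10:20]
    W2 = np.reshape(w[20:30], (10, 1))
    h = np.tanh(np.dot(x, W1) + b1)
    return np.sum(np.dot(h, W2))          # scalar f(x; w)

dfdx = grad(f, 0)                         # gradient of f w.r.t. the input x (argnum 0)

def loss(w, x_data, dfdx_true):
    # Squared error between df/dx and the target gradient.
    return np.sum((dfdx(x_data, w) - dfdx_true) ** 2)

lossgradient = grad(loss, 0)              # gradient of the loss w.r.t. the weights w

w = 0.1 * np.random.randn(30)
x_data = np.random.randn(5, 1)
dfdx_true = np.ones((5, 1))               # made-up target for df/dx
lr = 1e-2
w = w - lr * lossgradient(w, x_data, dfdx_true)
```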
In PyTorch, how can I make `loss` contain the gradient w.r.t. `x` before I call `loss.backward()`? Note that the final result is a mixed derivative w.r.t. `x` and `w`, not a second-order derivative w.r.t. `w`.
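For concreteness, this is the kind of thing I am trying to write. I suspect `torch.autograd.grad` with `create_graph=True` is the relevant tool for keeping `dfdx` differentiable, but I am not sure it is the right approach; the toy model and all names below are my own:

```python
import torch

# Tiny stand-in network; the architecture and all names are assumptions.
model = torch.nn.Sequential(
    torch.nn.Linear(1, 10),
    torch.nn.Tanh(),
    torch.nn.Linear(10, 1),
)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

x = torch.randn(5, 1, requires_grad=True)  # need gradients w.r.t. the input
dfdx_true = torch.ones(5, 1)               # made-up target for df/dx

y = model(x).sum()                         # scalar f(x; w)

# Gradient of f w.r.t. x; create_graph=True keeps this result
# differentiable, so the loss below can be backpropagated to w.
dfdx, = torch.autograd.grad(y, x, create_graph=True)

loss = ((dfdx - dfdx_true) ** 2).sum()

opt.zero_grad()
loss.backward()   # mixed derivative: d/dw of (df/dx), not a second derivative in w
opt.step()
```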