# Loss function coming from a PDE

Hello,

I’m trying to train a PINN whose physics loss is computed with FEniCS, so I have to detach a torch tensor and then provide my own implementation of the gradient. When I train, the loss stays constant after each epoch; it doesn’t increase or decrease. This probably means that the gradient of the loss that torch is obtaining is zero. I would appreciate some help figuring out what I’m doing wrong, as well as any suggestions on best practice for training PINNs this way.

Here is the implementation that I’m using:

This is the `autograd.Function` subclass that I’m using:

```python
import numpy as np
import torch

import Dg  # my module wrapping the FEniCS Darcy formulation


class PDE_Loss(torch.autograd.Function):
    # p_dofs: this is the output of the neural net
    # A_matrix_params: this is the input to the neural net
    # formulation is a class that computes the necessary vectors and
    # matrices using FEniCS

    @staticmethod
    def forward(
        ctx,
        p_dofs: torch.Tensor,
        A_matrix_params: torch.Tensor,
        lossfun: torch.nn.Module,
        formulation: Dg.Darcy_primal_formulation,
        f: torch.Tensor,
    ) -> torch.Tensor:
        result = lossfun(
            torch.matmul(
                torch.from_numpy(
                    formulation.assemble_linear_system(
                        A_matrix_params.detach().numpy()
                    )
                ),
                p_dofs,
            ),
            f,
        )
        ctx.formulation = formulation
        ctx.lossfun = lossfun
        ctx.save_for_backward(
            p_dofs,
            A_matrix_params,
            f,
            result,
        )
        # print(f"{result=}")
        return result

    # Gradient is estimated using (forward) finite differences
    @staticmethod
    def backward(ctx, grad_output):
        eps = 1e-8
        (
            p_dofs,
            A_matrix_params,
            f,
            forward_res,
        ) = ctx.saved_tensors

        formulation = ctx.formulation
        lossfun = ctx.lossfun

        grad = []
        for i in range(3):
            A_matrix_params_eps_away = A_matrix_params.clone()
            A_matrix_params_eps_away[i] += eps

            result_eps_away = lossfun(
                torch.matmul(
                    torch.from_numpy(
                        formulation.assemble_linear_system(
                            A_matrix_params_eps_away.numpy()
                        )
                    ),
                    p_dofs,
                ),
                f,
            )
            grad.append((result_eps_away - forward_res).item() / eps)

        return None, torch.tensor(np.array(grad)), None, None, None
```

Hi Sebastian!

In principle you should be able to make this approach work.

As a first debugging step, I would suggest that you test your gradient
computation directly with a single forward pass / backward pass (rather
than observing that your training doesn’t work and trying to deduce that
you’re getting a zero gradient). Take a simple test case for which you
are comfortable that you can reliably compute the gradient with some
independent method, say analytically or with an independent numerical
computation.

Does your `backward()` method (as called by running a forward pass
followed by a backward pass) reproduce the known gradient for your
test case? Does it incorrectly return zero?
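As a concrete sketch of that kind of test, pytorch’s `torch.autograd.gradcheck` compares a custom `backward()` against its own finite-difference reference on double-precision inputs. The `Square` Function below is a made-up stand-in for your `PDE_Loss`, just to show the mechanics:

```python
import torch

class Square(torch.autograd.Function):
    """Toy custom Function with a hand-written backward."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x ** 2

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return grad_output * 2 * x  # d(x^2)/dx = 2x

# gradcheck wants double precision and requires_grad=True
x = torch.randn(5, dtype=torch.double, requires_grad=True)
print(torch.autograd.gradcheck(Square.apply, (x,), eps=1e-6))  # True when backward is correct
```

If your `backward()` is returning zeros, `gradcheck` (or the equivalent manual comparison) will fail loudly and tell you exactly which input’s gradient disagrees.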

Note that your `eps = 1e-8` is at or below the edge of single-precision
round-off error, so that `A_matrix_params_eps_away[i] += eps` may
end up being numerically equal to `A_matrix_params`, which would
make your finite-difference gradient exactly zero.

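You can see this round-off effect directly (a minimal illustration, not your actual code):

```python
import torch

a = torch.tensor([1.0])        # float32 by default
print((a + 1e-8) == a)         # tensor([True]): 1e-8 vanishes below float32 resolution

b = a.double()
print((b + 1e-8) == b)         # tensor([False]): double precision resolves the step
```

When the perturbed and unperturbed parameters are bitwise equal, the two loss evaluations match and the difference quotient is identically zero.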
If you can find an analytic or semi-analytic formula for the gradient, you
will almost certainly be better off using that than numerical differentiation.
If you have to use numerical (finite-difference) differentiation, it should
be possible to make it work adequately well, although such computations
can be nuanced and rather tricky.
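One standard way to make finite differences less touchy is to use central rather than one-sided differences; the truncation error drops from O(eps) to O(eps²), at the cost of one extra function evaluation per parameter. A sketch on a toy function (not your PDE loss):

```python
import numpy as np

def f(x):
    return np.exp(x)

x0, eps = 0.5, 1e-4
exact = np.exp(x0)  # d/dx exp(x) = exp(x)

forward = (f(x0 + eps) - f(x0)) / eps              # O(eps) truncation error
central = (f(x0 + eps) - f(x0 - eps)) / (2 * eps)  # O(eps^2) truncation error

print(abs(forward - exact), abs(central - exact))
```

For your `backward()`, this would mean evaluating the loss at `A_matrix_params[i] + eps` and `A_matrix_params[i] - eps` instead of reusing the saved forward result.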

Start by making `eps` bigger than `1e-8`. You might consider performing
the numerical differentiation in double precision if doing so is compatible
with the rest of your computation.

In general, if you make `eps` too big, your finite-difference estimate will
become polluted with higher-order terms. If you make it too small, your
estimate will become noisy due to round-off error, and at some point
`eps` will, so to speak, round down to zero. I would suggest that you
take a selection of “typical” use cases and try to empirically determine
a reasonable value for `eps`.
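A quick empirical sweep on a toy function (you would do this with your actual loss) shows the usual U-shaped error curve: too large an `eps` picks up higher-order terms, too small an `eps` drowns in round-off noise:

```python
import numpy as np

def f(x):
    return np.sin(x)

x0 = 1.0
exact = np.cos(x0)  # d/dx sin(x) = cos(x)

for eps in [1e-1, 1e-3, 1e-5, 1e-7, 1e-9, 1e-11]:
    fd = (f(x0 + eps) - f(x0)) / eps  # forward difference
    print(f"eps={eps:.0e}  abs error={abs(fd - exact):.2e}")
```

In double precision the sweet spot for a forward difference is typically around the square root of machine epsilon (roughly `1e-8`); in single precision it is much larger, which is another reason your `eps = 1e-8` is problematic.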

You might also look for an existing numerical-differentiation package.
A good one will likely give you more robust results than something you
write yourself. However, such a package will also tend to be a bit tricky
to use – you will probably have to provide it some sort of “control
parameters” that, in effect, give it some guidance as to how large it
should make its internal equivalent of `eps`.
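For instance, SciPy ships `scipy.optimize.approx_fprime`, where the `epsilon` argument is exactly that kind of control parameter (a sketch on a toy scalar loss; whether SciPy fits alongside your FEniCS pipeline is for you to judge):

```python
import numpy as np
from scipy.optimize import approx_fprime

def loss(x):
    return float(np.sum(x ** 2))  # toy scalar loss; gradient is 2*x

x0 = np.array([1.0, 2.0, 3.0])
step = np.sqrt(np.finfo(float).eps)  # ~1.5e-8, a common default step in double precision
grad = approx_fprime(x0, loss, step)
print(grad)  # close to [2., 4., 6.]
```

Under the hood this is the same forward-difference loop you wrote by hand, but with a step size chosen to suit double-precision arithmetic.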

Good luck!

K. Frank