Receiving 'nan' parameters after first optimization step

I am using a 5-layer fully connected neural network with a tanh() activation function. I am using it in a PINN model, which has worked fine several times before, but not this time.

When I use torch.autograd.set_detect_anomaly(True), the following error message appears:

RuntimeError                              Traceback (most recent call last)
<ipython-input-21-268a4603ec9c> in <module>
----> 1 loss.backward()
      2 optimizer.step()

~/.conda/envs/nr_powerai36/lib/python3.7/site-packages/torch/ in backward(self, gradient, retain_graph, create_graph)
    193                 products. Defaults to ``False``.
    194         """
--> 195         torch.autograd.backward(self, gradient, retain_graph, create_graph)
    197     def register_hook(self, hook):

~/.conda/envs/nr_powerai36/lib/python3.7/site-packages/torch/autograd/ in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
     97     Variable._execution_engine.run_backward(
     98         tensors, grad_tensors, retain_graph, create_graph,
---> 99         allow_unreachable=True)  # allow_unreachable flag

RuntimeError: Function 'ReciprocalBackward' returned nan values in its 0th output.

I am not sure how to tackle this error. What exactly is the 'ReciprocalBackward' function?

ReciprocalBackward points to a division by a tensor; it is the backward function of both 1/x and torch.reciprocal(x):

import torch

x = torch.randn(1, requires_grad=True)
y1 = 1 / x
y2 = torch.reciprocal(x)

print(y1 == y2)
> tensor([True])
print(y1)
> tensor([-1.2178], grad_fn=<ReciprocalBackward>)

I assume the tensor used in the division might be zero (or close to zero), which would yield Inf as the output and thus also an invalid gradient.
Could you check your model for these operations and make sure the values used are in a reasonable range?
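As a minimal sketch of what this failure mode looks like and one common workaround: dividing by a tensor that contains zero produces Inf in the forward pass, and the backward pass then yields non-finite gradients. One simple guard (assuming the denominator is expected to be non-negative, which may not hold for your model) is to clamp it away from zero before dividing; the eps value below is a hypothetical floor you would tune for your problem:

```python
import torch

eps = 1e-8  # hypothetical floor for the denominator; tune for your problem

x = torch.zeros(3, requires_grad=True)

# Unsafe: 1 / 0 overflows to inf in the forward pass, and backward()
# would then propagate non-finite gradients through ReciprocalBackward.
y = 1 / x
print(torch.isfinite(y))  # tensor([False, False, False])

# Safer: clamp the denominator away from zero before dividing.
# Note: clamp(min=eps) assumes a non-negative denominator.
y_safe = 1 / x.clamp(min=eps)
y_safe.sum().backward()
print(torch.isfinite(x.grad))  # tensor([True, True, True])
```

With set_detect_anomaly(True) enabled, the unguarded version is the kind of operation the anomaly detector flags; clamping (or otherwise keeping the denominator in a reasonable range) avoids the NaN at its source rather than masking it afterwards.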