L-BFGS does not converge on GPU, but works fine on CPU

I am running some training using the code from PINNs/Burgers Inference (PyTorch).ipynb at master · jayroxis/PINNs · GitHub.
I found that training behaves very differently on CPU (the loss converges to small values, as shown in the repo) and on GPU (it does not converge and stalls at loss=0.08).

May I ask for possible reasons for this?
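
To help narrow this down, here is a minimal sketch of how the two devices could be compared with the same seed, the same initial weights, and an explicit dtype. It is not the notebook's code: the toy model, the target function, and the `run_lbfgs` helper are hypothetical stand-ins, and the L-BFGS settings are assumptions. One thing that might be worth checking is whether the gap remains in float64, since float32 results can differ slightly between CPU and GPU kernels and L-BFGS with a line search may be sensitive to that.

```python
import math
import torch


def run_lbfgs(device, dtype=torch.float64, seed=0, steps=20):
    """Fit a tiny MLP to sin(pi * x) with L-BFGS and return the final loss.

    Hypothetical helper for illustration only; not taken from the notebook.
    """
    torch.manual_seed(seed)
    # Create the model on CPU first, then move it, so that the initial
    # weights are identical no matter which device is used afterwards.
    model = torch.nn.Sequential(
        torch.nn.Linear(1, 16),
        torch.nn.Tanh(),
        torch.nn.Linear(16, 1),
    ).to(device=device, dtype=dtype)

    x = torch.linspace(-1.0, 1.0, 64, dtype=dtype, device=device).unsqueeze(1)
    y = torch.sin(math.pi * x)

    # Assumed settings; the notebook may use different ones.
    optimizer = torch.optim.LBFGS(
        model.parameters(), lr=1.0, max_iter=20, line_search_fn="strong_wolfe"
    )

    def closure():
        # L-BFGS re-evaluates the loss several times per step via this closure.
        optimizer.zero_grad()
        loss = torch.mean((model(x) - y) ** 2)
        loss.backward()
        return loss

    for _ in range(steps):
        optimizer.step(closure)

    with torch.no_grad():
        return torch.mean((model(x) - y) ** 2).item()


if __name__ == "__main__":
    devices = ["cpu"] + (["cuda"] if torch.cuda.is_available() else [])
    for device in devices:
        for dtype in (torch.float32, torch.float64):
            print(device, dtype, run_lbfgs(device, dtype))
```

If the float64 runs agree across devices while the float32 runs do not, that would point toward numerical precision rather than a bug in the training loop.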