Hello everybody!
I am quite a novice in PyTorch, and in Python as well. I am trying to familiarize myself with PyTorch because of the application of ANNs to PDEs, which seems an extremely interesting topic. So I set myself the task of extending this implementation, NNets-and-Diffeqns/non-linear-pde-final.ipynb at master · dbgannon/NNets-and-Diffeqns · GitHub, to 2D. More specifically, this code approximates the solution of a 1D Poisson equation with a feedforward neural network. I would like to investigate, along the same lines, the following 2D BVP

where Delta is the Laplace operator. An analytic solution is u_exact=sin(pi x) sin(pi y). Extending some parts is straightforward, but I got stuck on imposing the 2D boundary condition, i.e. the appropriate modification of In [67], and on the modification of In [70], where, frankly, I don’t really understand what it does (does it create mini-batches?). Could someone provide some guidelines for my implementation and/or point to related implementations and projects using PyTorch?

So the notebook you linked pre-generates 20 batches of 10 random evaluation points in the interior (batches), the right-hand side at those points (fbatches, of the same shape), and, in In [67], 2 evaluation points on the boundary as the variable bndry (in 1D, these two points are the entire boundary). It then sets Truebndry to zeros of the same shape as bndry.
For both bndry and batches, the points are stacked along the batch dimension.

A training step roughly looks like

zero the gradients,

set b, fb to a random batch of 10 points from batches and fbatches,

evaluate mynet at bndry and assign the result to output_bndry,

in the function Du: evaluate mynet at b, compute the second derivatives using autograd.grad, and return them (they end up as outputb in the training loop),

use MSE loss (mean squared error) to compute the discrepancy between outputb and fb in the interior,

use MSE loss to compare output_bndry to the desired boundary values Truebndry,

call backward on both losses separately and step the two different optimizers (but without zeroing the gradients between the first step and the second backward, which is almost certainly not intended, as the second step then also sees the accumulated gradients of the first loss).
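To make the Du step above a bit more concrete, and with an eye on the 2D extension, here is a minimal sketch of how second derivatives can be summed into a Laplacian with autograd.grad. This is not the notebook's code: the function name laplacian and the check against sin(pi x) sin(pi y) are my own choices for illustration, and the sketch assumes each output row depends only on the matching input row (true for a plain feedforward net without batch norm) and that the function is nonlinear in its input.

```python
import math
import torch

def laplacian(u_fn, x):
    """Sum of the second derivatives of u_fn w.r.t. each input coordinate.

    x has shape (N, d); u_fn maps it row by row to shape (N,) or (N, 1).
    """
    x = x.detach().clone().requires_grad_(True)
    u = u_fn(x)
    # First derivatives; create_graph=True keeps the graph so that we can
    # differentiate a second time (and later backprop into the net's weights).
    (grad_u,) = torch.autograd.grad(u.sum(), x, create_graph=True)
    lap = torch.zeros_like(x[:, 0])
    for i in range(x.shape[1]):
        # Derivative of du/dx_i w.r.t. all inputs; column i is d2u/dx_i^2.
        (grad2,) = torch.autograd.grad(grad_u[:, i].sum(), x, create_graph=True)
        lap = lap + grad2[:, i]
    return lap

# Sanity check on a function with a known Laplacian (illustration only).
def u_exact(p):
    return torch.sin(math.pi * p[:, 0]) * torch.sin(math.pi * p[:, 1])

pts = torch.rand(5, 2, dtype=torch.float64)
lap = laplacian(u_exact, pts)  # should match -2 * pi^2 * u_exact(pts)
```

In a training loop you would pass the network itself as u_fn and compare the result to the right-hand side at the same points.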

Some notes on this. Keep in mind that this critique is only intended to point out some things where I think a different approach might be better for learning PyTorch, Python and programming for numerical analysis. It is not complete, nor do I intend to fault the original author of the code (i.e. if the code does what they need, great, but maybe don’t follow it too closely for learning):

You would need to do some non-trivial thinking about the evaluation points (and you probably need many more of them) to do 2D.

Note that output_bndry holds values of the would-be solution, while outputb holds values of the second derivatives of the would-be solution.

The two optimizers and two backward calls are almost certainly not a good idea. A much cleaner way to achieve the equivalent is to compute a weighted average of the two losses and then call backward once on this average. Note that to compute the loss weights, you have to take into account the number of points in each batch and the different learning rates. Calling backward with retain_graph all the time is not a good idea either, but calling backward only once lets you get rid of it easily.

And as an aside, the notebook extensively uses the tensor creation style that has been considered obsolete since PyTorch 0.4 came out 3 years ago, and it seems needlessly complicated in many respects. By itself, that seems OK, but I would advise against using this code to learn PyTorch, Python or coding.
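As a rough sketch of the single-loss suggestion in the notes above: one optimizer, one weighted sum of the two losses, one backward call, and tensors created directly with torch.rand / torch.tensor instead of the pre-0.4 Variable wrappers. Everything specific here is an assumption for illustration, not taken from the notebook: the small tanh network, the toy 1D problem u'' = -pi^2 sin(pi x) with zero boundary values, the weight bndry_weight, and the fixed batch of points (the notebook picks a random pre-generated batch each step).

```python
import math
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in network for the 1D problem (architecture is an assumption).
net = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
mse = nn.MSELoss()

x_interior = torch.rand(10, 1)                    # 10 random points in [0, 1)
f_interior = -(math.pi ** 2) * torch.sin(math.pi * x_interior)  # toy rhs
x_bndry = torch.tensor([[0.0], [1.0]])            # the two boundary points
u_bndry = torch.zeros(2, 1)                       # desired boundary values

bndry_weight = 1.0  # hypothetical weight; needs tuning for a real problem

def second_derivative(model, x):
    """d2u/dx2 of the model output at the points x, shape (N, 1)."""
    x = x.clone().requires_grad_(True)
    u = model(x)
    (du,) = torch.autograd.grad(u.sum(), x, create_graph=True)
    (d2u,) = torch.autograd.grad(du.sum(), x, create_graph=True)
    return d2u

def train_step():
    optimizer.zero_grad()
    loss_interior = mse(second_derivative(net, x_interior), f_interior)
    loss_bndry = mse(net(x_bndry), u_bndry)
    loss = loss_interior + bndry_weight * loss_bndry
    loss.backward()   # a single backward, so no retain_graph is needed
    optimizer.step()
    return loss.item()

losses = [train_step() for _ in range(200)]
```

Because both losses go through one backward call, there is no question of gradients from one loss leaking into the other optimizer's step.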

Again, the above critique is merely a suggestion for your own learning path and ideas on how to write clearer code achieving the same thing. It does not contain any criticism of the author’s code itself, which may or may not achieve what the author had intended to do with it.

Thomas, thank you for your comprehensive reply! I have a better understanding of the code now, and I agree that some points are unnecessarily complicated. I also agree that the two backward calls are not a good idea; as you wrote, the weighted sum of the two losses is a better and more standard approach. I found a way to generate the 2D set of evaluation points, and now I need to find out how to feed them into the network. Obviously the first nn.Linear should now have two inputs instead of one.
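For anyone following along, here is one possible sketch of the 2D setup described above: points stored as rows of an (N, 2) tensor and a first nn.Linear layer with two inputs. The unit square as domain, the helper names sample_interior and sample_boundary, the layer sizes and the point counts are all assumptions made for illustration, not something fixed by the notebook or by this thread.

```python
import torch
from torch import nn

def sample_interior(n):
    """n uniformly random points in the unit square, shape (n, 2)."""
    return torch.rand(n, 2)

def sample_boundary(n_per_edge):
    """Random points on the four edges of the unit square, shape (4 * n_per_edge, 2)."""
    t = torch.rand(n_per_edge, 1)
    zeros, ones = torch.zeros_like(t), torch.ones_like(t)
    edges = [
        torch.cat([t, zeros], dim=1),  # bottom edge: y = 0
        torch.cat([t, ones], dim=1),   # top edge:    y = 1
        torch.cat([zeros, t], dim=1),  # left edge:   x = 0
        torch.cat([ones, t], dim=1),   # right edge:  x = 1
    ]
    return torch.cat(edges, dim=0)

# The first layer takes two inputs (x, y); the output is the scalar u(x, y).
net = nn.Sequential(
    nn.Linear(2, 32), nn.Tanh(),
    nn.Linear(32, 32), nn.Tanh(),
    nn.Linear(32, 1),
)

interior = sample_interior(100)
boundary = sample_boundary(25)
u_interior = net(interior)  # shape (100, 1)
u_boundary = net(boundary)  # shape (100, 1)
```

The points are stacked along the batch dimension just as in the 1D notebook; only the last dimension grows from 1 to 2.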