Hello. I am struggling to implement a solution to the following problem in PyTorch, and I would appreciate some feedback from this forum. Suppose I want to fit a 2D model M(x,y) (some unspecified NN model) to reproduce the values of a certain function f(x,y). I know how to do this in PyTorch without any problem. Now suppose that, in addition to the function values f(x,y), I can easily obtain df/dx and df/dy. Naturally, one would think it should be possible to use this additional information to fit the model M better, right?

It would seem that one could set the requires_grad flag to True for both x and y to obtain the derivatives dM/dx and dM/dy by backpropagating through the model, and then define the loss function so that the parameters of M reproduce not only the values of f(x,y) but also its derivatives with respect to x and y. I would expect that proceeding in this way would probably make the optimization of the parameters faster and more robust.
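To make the idea concrete, here is a minimal sketch of how I imagine obtaining dM/dx and dM/dy. The small network standing in for M, the batch size, and the variable names are all assumptions for illustration. I am also assuming that torch.autograd.grad with create_graph=True is the right tool here, since the derivatives themselves would later need to be differentiated with respect to the parameters of M.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for M: any differentiable network mapping (x, y) to a scalar.
model = torch.nn.Sequential(
    torch.nn.Linear(2, 16),
    torch.nn.Tanh(),
    torch.nn.Linear(16, 1),
)

# A batch of N input points of shape (N, 2); column 0 is x, column 1 is y.
xy = torch.rand(8, 2, requires_grad=True)

pred = model(xy)  # M(x, y) for each point, shape (N, 1)

# Each output row depends only on its own input row, so the gradient of the
# summed output gives the per-sample derivatives. create_graph=True keeps the
# graph so these derivatives can themselves be backpropagated through later.
(grad_xy,) = torch.autograd.grad(pred.sum(), xy, create_graph=True)

dM_dx = grad_xy[:, 0]  # shape (N,)
dM_dy = grad_xy[:, 1]  # shape (N,)
```

If this is correct, dM_dx and dM_dy could be used inside a loss just like pred.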

The question is: how does one implement this idea in PyTorch? I have a clear understanding of the typical fitting cycle when only function information [f(x,y)] is used, but I struggle to generalise this cycle to the case where both function and derivative information are available. Can someone offer any guidance on how to proceed? Any simple examples?
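For reference, this is my tentative attempt at such a generalised cycle. It is only a sketch under several assumptions: the target f(x,y) = sin(x)cos(y) and its analytic derivatives are made up for illustration, as are the network size, the learning rate, and the relative weight of the derivative term in the loss. I am not sure this is the recommended way to do it.

```python
import torch

torch.manual_seed(0)

def f(xy):
    # Hypothetical target, chosen only for illustration: f(x, y) = sin(x) * cos(y).
    x, y = xy[:, 0], xy[:, 1]
    return torch.sin(x) * torch.cos(y)

def grad_f(xy):
    # Analytic derivatives (df/dx, df/dy) of the hypothetical target, shape (N, 2).
    x, y = xy[:, 0], xy[:, 1]
    return torch.stack(
        [torch.cos(x) * torch.cos(y), -torch.sin(x) * torch.sin(y)], dim=1
    )

# Hypothetical stand-in for M.
model = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

# Training data: points in [-2, 2]^2 with function values and derivatives.
xy_train = torch.rand(256, 2) * 4 - 2
f_train = f(xy_train)
g_train = grad_f(xy_train)

deriv_weight = 1.0  # assumed relative weight of the derivative term
mse = torch.nn.functional.mse_loss

losses = []
for step in range(200):
    optimizer.zero_grad()

    # Inputs must track gradients so that dM/dx and dM/dy can be computed.
    xy = xy_train.clone().requires_grad_(True)
    pred = model(xy).squeeze(-1)

    # create_graph=True makes the input derivatives differentiable with
    # respect to the model parameters, so they can enter the loss.
    (grad_pred,) = torch.autograd.grad(pred.sum(), xy, create_graph=True)

    loss = mse(pred, f_train) + deriv_weight * mse(grad_pred, g_train)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

The only changes relative to the usual cycle are the requires_grad_ call on the inputs, the extra torch.autograd.grad call, and the second term in the loss.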

Thanks!