I am training a neural network on a set of data points that contain information about the derivatives of a function $f$:
$$\text{Train Data} = \{(x_i, f'(x_i))\}_{i=1}^{100}.$$
Additionally, I have supplementary data in the form of
$$\{f_k\}_{k=1}^{100^2},$$
i.e. $100^2$ function values.
For the latter set I do not have the corresponding $x_k$. My goal is to approximate the function $f$ with a neural network $N$.
My proposed approach involves two loss functions, defined as follows:
$$L_1 = \sum_i \| N'(x_i) - f'(x_i) \|^2,$$
$$L_2 = \sum_i \min_k \| N(x_i) - f_k \|^2,$$
where $L_2$ measures the squared distance from $N(x_i)$ to the closest value $f_k$.
Is it then possible to define the total loss as
$$L = L_1 + L_2,$$
and is it legitimate to follow this approach? I am unsure because $L_2$ requires a closest-point search over the $f_k$ in every epoch.
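
To make this concrete, here is a rough sketch of what I have in mind (assuming PyTorch, a scalar-input/scalar-output network, and that $L_2$ means the minimum over $k$; the data, architecture, and hyperparameters below are placeholders):

```python
import torch

# Placeholder data; only the shapes matter here:
# 100 points with known derivatives, 100^2 unordered function values.
x_train = torch.rand(100, 1)        # x_i
df_train = torch.rand(100, 1)       # f'(x_i)
f_values = torch.rand(100**2, 1)    # f_k (no corresponding x_k)

# Small MLP as the surrogate N (architecture is an arbitrary choice)
net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1)
)

def total_loss(x, df, f_vals):
    # L_1: match the network derivative N'(x_i) to f'(x_i) via autograd
    x = x.clone().requires_grad_(True)
    y = net(x)
    dydx, = torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y),
                                create_graph=True)
    l1 = ((dydx - df) ** 2).sum()

    # L_2: squared distance from each N(x_i) to its closest f_k,
    # i.e. a nearest-value search redone at every optimization step
    dists = torch.cdist(y, f_vals) ** 2       # shape (100, 100^2)
    l2 = dists.min(dim=1).values.sum()

    return l1 + l2                            # L = L_1 + L_2

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for epoch in range(1000):
    opt.zero_grad()
    loss = total_loss(x_train, df_train, f_values)
    loss.backward()
    opt.step()
```

In this sketch the gradient of $L_2$ flows only through the $f_k$ that is currently closest to $N(x_i)$, which is exactly the per-epoch closest-point search I am unsure about.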