Do I need to implement gradient for this?

Imagine a situation in which my neural network takes the coordinates of the planets at some time t and outputs their coordinates at a time tt. Due to the nature of the simulation I am using to train this network, it is much more convenient to compare the distances from one fixed planet to all the others than to compare all coordinates directly, which is why I would like my loss function to have a term like

L = sum_over_batch [ (Distances_by_model - Distances_by_ground_truth)**2 ]

Just to make clear: distance is a function of the planets' coordinates which maps an array of coordinates of shape (N_planets, 3) to a vector of shape (N_planets, 1).

Finally, my question is: since my NN outputs coordinates, not distances, but I have distances in the loss function, do I have to implement a custom autograd distance function? If not, could you name an example in which I would have to implement one?

Thanks in advance,
confused physics student

If you can express your loss function as a sequence of differentiable operations on the ground truth and the output of the model that already exist in PyTorch, then you do not need to implement your own autograd distance function. As a toy example, if your loss function is something like (((model_x1 - model_x2)**2 + (model_y1 - model_y2)**2) - ground_truth_val1)**2, you would not need to write your own autograd function: the basic operations of addition, subtraction, and squaring are all differentiable and have their own autograd implementations that can be chained together.
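To make this concrete, here is a minimal sketch of your setup (the names `distances_to_first`, `pred_coords`, and `true_coords` are my own, and `requires_grad=True` on the coordinates stands in for a real network's parameters). The distance function is built entirely from PyTorch ops, so `backward()` just works:

```python
import torch

torch.manual_seed(0)

# Toy stand-ins: in practice pred_coords would be the network's output.
pred_coords = torch.randn(4, 3, requires_grad=True)  # (N_planets, 3)
true_coords = torch.randn(4, 3)                      # ground truth, (N_planets, 3)

def distances_to_first(coords):
    # Distances from planet 0 to each of the remaining planets, (N_planets - 1, 1).
    # Planet 0 itself is excluded: the norm of the zero vector has no gradient.
    return torch.linalg.norm(coords[1:] - coords[0], dim=1, keepdim=True)

# The loss term from the question, built only from differentiable torch ops.
loss = ((distances_to_first(pred_coords) - distances_to_first(true_coords)) ** 2).sum()
loss.backward()  # autograd chains the built-in ops; no custom Function needed
```

After `backward()`, `pred_coords.grad` holds the gradient of the loss with respect to every coordinate, which is exactly what would flow back into the network.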

However, if in the future you wanted to fuse your loss function into a single kernel for higher performance (speed/efficiency), then you might want to look into writing your own autograd function plus a custom kernel. Otherwise it isn't necessary.
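For reference, this is roughly what such a hand-written autograd function looks like. The class name and the op itself (a fused squared-distance between two point sets) are invented for illustration; in a real fused version the body of `forward` would call a custom kernel that autograd cannot trace through, which is why you must supply `backward` yourself:

```python
import torch

class FusedSquaredDistance(torch.autograd.Function):
    """Hypothetical fused op: row-wise squared distance between a and b."""

    @staticmethod
    def forward(ctx, a, b):
        diff = a - b
        ctx.save_for_backward(diff)          # stash what backward will need
        return (diff ** 2).sum(dim=1, keepdim=True)  # (N, 1)

    @staticmethod
    def backward(ctx, grad_out):
        (diff,) = ctx.saved_tensors
        # d/da of sum_j (a_j - b_j)^2 is 2 * diff; grad_out (N, 1) broadcasts.
        grad = 2.0 * diff * grad_out
        return grad, -grad                   # gradients w.r.t. a and b

a = torch.randn(5, 3, requires_grad=True)
b = torch.randn(5, 3)
out = FusedSquaredDistance.apply(a, b)
out.sum().backward()                         # uses the hand-written backward
```

Until you actually need that fusion, the chained built-in ops above are the simpler and safer route, since PyTorch derives the backward pass for you.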
