Numerically stable acos dot product derivative?

I’m currently using the angle between two vectors as part of my cost function:

def unit_vector(v):
    return v / v.norm()  # assumed helper: normalize to unit length

def angle_between(vector0, vector1):
    unit_vector0 = unit_vector(vector0)
    unit_vector1 = unit_vector(vector1)
    return unit_vector0.dot(unit_vector1).clamp(-1.0, 1.0).acos()

However, I seem to be running into numerical issues (NaNs) during training with this function. I believe it's happening because the derivative of acos approaches infinity as its input approaches -1 or 1, even though the derivative of acos(dot(unit_vector0, unit_vector1)) as a whole is numerically stable (as long as the original vector norms are not approaching zero). That is, assuming the original vectors have a norm greater than 1, the final gradients computed through this function should max out at 1. But once the acos derivative blows up to infinity, the derivative of the dot product can't undo that, and the result is NaN.
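To illustrate (a minimal sketch, assuming the angle_between / unit_vector definitions above), the NaN shows up as soon as the two vectors are exactly parallel, because the clamped dot product hits 1.0:

import torch

v0 = torch.tensor([1.0, 0.0, 0.0], requires_grad=True)
v1 = torch.tensor([2.0, 0.0, 0.0])  # parallel to v0, so the unit-vector dot product is exactly 1.0

angle_between(v0, v1).backward()
print(v0.grad)  # tensor([nan, nan, nan]): acos'(x) = -1/sqrt(1 - x^2) is -inf at x = 1,
                # and -inf times the (zero) gradient of the dot product gives NaN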

Is there some way to set up this calculation in a numerically stable way? Or does PyTorch already contain some fused function to do this stably? Thank you!

I’m not sure you can set this up in a numerically stable way because, as you said, the problem is that the derivative of acos goes to infinity as the input approaches -1 or 1.

Is it possible to just use the dot product as a part of the loss function, and not include the acos?
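For instance, something like this (a hypothetical sketch; cosine_distance is just an illustrative name, not from your code) stays well-behaved because the gradient of the cosine itself never blows up:

import torch

def cosine_distance(vector0, vector1, eps=1e-8):
    # 1 - cos(theta): monotone in the angle on [0, pi], with bounded gradients
    cos_theta = torch.nn.functional.cosine_similarity(vector0, vector1, dim=-1, eps=eps)
    return 1.0 - cos_theta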

I had the same problem.

Using clamp(-1.0 + eps, 1.0 - eps) instead of clamp(-1.0, 1.0) did the trick for me…
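In code, something like the following (eps = 1e-7 is just an assumed value; it trades a tiny bias near 0 and pi for finite gradients):

import torch

def angle_between(vector0, vector1, eps=1e-7):
    unit_vector0 = vector0 / vector0.norm()
    unit_vector1 = vector1 / vector1.norm()
    # Clamp strictly inside (-1, 1) so the acos derivative stays finite
    return unit_vector0.dot(unit_vector1).clamp(-1.0 + eps, 1.0 - eps).acos()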


@richard: It might be usable, but it’s not ideal for the loss I’m hoping for. But I may try it anyway, thanks!

@hbredin: This was my plan if no one had anything better. Hearing that you’ve already taken this route suggests I’m probably right that a better option won’t show up. Thanks!