I’m currently using the angle between two vectors as part of my cost function:
```python
def angle_between(vector0, vector1):
    unit_vector0 = unit_vector(vector0)
    unit_vector1 = unit_vector(vector1)
    return unit_vector0.dot(unit_vector1).clamp(-1.0, 1.0).acos()
```
However, I seem to be running into numerical issues (NaNs) during training when using this function. I believe it's happening because the derivative of `acos` approaches infinity as its input approaches -1 or 1, even though the derivative of the full expression `acos(dot(unit_vector0, unit_vector1))` with respect to the original vectors is numerically stable (so long as the original vector norms are not approaching zero). That is, assuming the original vectors have a norm of at least 1, the final gradients computed through this function should max out at 1. However, if the derivative of `acos` overflows, then the derivative of the dot product won't be able to undo the overflow.
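To spell out the reasoning above (my own working, with $x = \hat{u}_0 \cdot \hat{u}_1 = \cos\theta$ and $\hat{u}_i = v_i / \lVert v_i \rVert$), the chain rule splits the gradient into two factors:

$$
\frac{d}{dx}\arccos(x) = -\frac{1}{\sqrt{1 - x^2}} = -\frac{1}{\sin\theta},
\qquad
\frac{\partial x}{\partial v_0} = \frac{\hat{u}_1 - x\,\hat{u}_0}{\lVert v_0 \rVert},
\quad
\left\lVert \frac{\partial x}{\partial v_0} \right\rVert = \frac{\sin\theta}{\lVert v_0 \rVert}
$$

$$
\left\lVert \frac{\partial \theta}{\partial v_0} \right\rVert
= \frac{1}{\sin\theta} \cdot \frac{\sin\theta}{\lVert v_0 \rVert}
= \frac{1}{\lVert v_0 \rVert}
$$

Analytically the $\sin\theta$ terms cancel, but in floating point the first factor can evaluate to infinity while the second evaluates to zero, and their product is then NaN rather than a bounded value.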
Is there some way to set up this calculation in a numerically stable way? Or does PyTorch already contain some fused function to do this stably? Thank you!
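For reference, one candidate I'm aware of is to avoid `acos` entirely and use the `atan2` identity `angle = 2 * atan2(|u0 - u1|, |u0 + u1|)` for unit vectors. The following is only a minimal sketch under my own assumptions: the `eps` parameter and the inline normalization (standing in for my `unit_vector` helper) are illustrative choices, and I have not checked it across PyTorch versions. It also relies on the gradient of a zero-length norm coming back finite, which I believe is the case in current releases.

```python
import torch


def angle_between(vector0, vector1, eps=1e-12):
    # Normalize along the last dimension; eps (an assumed safeguard)
    # keeps the division finite for near-zero vectors.
    u0 = vector0 / vector0.norm(dim=-1, keepdim=True).clamp_min(eps)
    u1 = vector1 / vector1.norm(dim=-1, keepdim=True).clamp_min(eps)
    # For unit vectors, |u0 - u1| = 2 sin(angle / 2) and
    # |u0 + u1| = 2 cos(angle / 2), so atan2 recovers angle / 2.
    # The squared arguments sum to 4, so atan2's derivative stays bounded.
    diff = (u0 - u1).norm(dim=-1)
    total = (u0 + u1).norm(dim=-1)
    return 2.0 * torch.atan2(diff, total)


# Check the case that breaks acos: two exactly parallel vectors.
v = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
w = v.detach().clone()
angle = angle_between(v, w)
angle.backward()
print(angle.item(), torch.isfinite(v.grad).all().item())
```

The output range is still [0, pi], so it should be a drop-in replacement for the `acos` version in a cost function, but I would treat it as a starting point rather than a verified fix.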