I’m currently using the angle between two vectors as part of my cost function:

```
import torch

def unit_vector(v):
    return v / v.norm()

def angle_between(vector0, vector1):
    unit_vector0 = unit_vector(vector0)
    unit_vector1 = unit_vector(vector1)
    return unit_vector0.dot(unit_vector1).clamp(-1.0, 1.0).acos()
```

However, I seem to be running into numerical issues (NaNs) during training with this function. I believe it's happening because the derivative of `acos` approaches infinity as its input approaches -1 or 1, even though the derivative of the full expression `acos(dot(unit_vector0, unit_vector1))` is numerically stable (as long as the original vector norms are not approaching zero). That is, assuming the original vectors have norms greater than 1, the final gradients computed through this function should max out at 1. However, if the intermediate gradient of `acos` overflows, then the derivative of the dot product won't be able to undo the overflow.
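For concreteness, here is a minimal reproduction of the failure mode I'm describing (variable names are my own; the parallel-vector case pins the dot product at exactly 1):

```
import torch

# Two parallel vectors: the dot product of their unit vectors is exactly 1,
# so acos is evaluated at the edge of its domain, where
# d/dx acos(x) = -1 / sqrt(1 - x^2) diverges. The normalization gradient
# contributes a factor of 0 in that direction, and 0 * inf = NaN.
v0 = torch.tensor([1.0, 0.0], requires_grad=True)
v1 = torch.tensor([2.0, 0.0])

u0 = v0 / v0.norm()
u1 = v1 / v1.norm()
angle = u0.dot(u1).clamp(-1.0, 1.0).acos()
angle.backward()
print(v0.grad)  # the gradient contains NaNs
```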

Is there some way to set up this calculation in a numerically stable way? Or does PyTorch already contain some fused function to do this stably? Thank you!
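For what it's worth, one reformulation I've seen suggested elsewhere (not a fused PyTorch function, and the function name here is my own) replaces `acos` with `atan2`, whose derivative stays bounded at the endpoints:

```
import torch

def angle_between_stable(vector0, vector1):
    # Kahan's formula: for unit vectors u and v,
    #     angle = 2 * atan2(||u - v||, ||u + v||)
    # atan2 has bounded partial derivatives wherever its two arguments
    # aren't both zero, avoiding the acos blow-up at +-1.
    u0 = vector0 / vector0.norm()
    u1 = vector1 / vector1.norm()
    return 2.0 * torch.atan2((u0 - u1).norm(), (u0 + u1).norm())
```

Note that exactly parallel (or exactly opposite) inputs still hit the non-differentiable point of `norm` at zero, so this is a sketch rather than a complete fix for every degenerate case.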