NaN gradient in torch.hypot

Hi all,

Back in 2017, it was decided that torch.norm would have a zero subgradient at zero (Norm subgradient at 0 by albanD · Pull Request #2775 · pytorch/pytorch · GitHub).
Applying the same logic, shouldn’t torch.hypot have a zero subgradient at (0, 0)?

Currently, torch.hypot produces NaN gradients for (0, 0) inputs, even though it is otherwise equivalent to torch.norm applied to a stack of the two input tensors. The NaN comes from the analytical gradient, d/dx hypot(x, y) = x / hypot(x, y), which is 0/0 at the origin.
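
For concreteness, here is a minimal repro sketch of the discrepancy (the commented outputs are what I would expect given the current behaviour described above):

```python
import torch

# torch.norm: zero subgradient at zero (the convention adopted in PR #2775)
x = torch.zeros(2, requires_grad=True)
torch.norm(x).backward()
print(x.grad)  # tensor([0., 0.])

# torch.hypot: NaN gradient at (0, 0), since the gradient divides by the output
a = torch.tensor(0.0, requires_grad=True)
b = torch.tensor(0.0, requires_grad=True)
torch.hypot(a, b).backward()
print(a.grad, b.grad)  # tensor(nan) tensor(nan)

# The same value computed via torch.norm of the stacked inputs gets zero gradients
a2 = torch.tensor(0.0, requires_grad=True)
b2 = torch.tensor(0.0, requires_grad=True)
torch.norm(torch.stack([a2, b2])).backward()
print(a2.grad, b2.grad)  # tensor(0.) tensor(0.)
```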

I’m sure such a change would be acceptable if it came with tests, etc.