When I try to compute the Hessian of a net that uses nn.Mish on a GPU, I get NaNs. I see that exp() is used in the C++ code, which could be the reason. Is being able to take the second derivative of the various internally implemented functions something that is expected to work, or not?
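For reference, a minimal sketch of the kind of computation being described, here on CPU in double precision with a toy net (the reported NaNs occur on GPU; the sizes and setup below are illustrative, not the original code):

```python
import torch
import torch.nn as nn

# Toy net using the built-in Mish activation; sizes are illustrative.
torch.manual_seed(0)
net = nn.Sequential(nn.Linear(4, 4), nn.Mish(), nn.Linear(4, 1)).double()

x = torch.randn(4, dtype=torch.double)

# Hessian of the scalar output with respect to the input.
H = torch.autograd.functional.hessian(lambda t: net(t).squeeze(), x)
print(H.shape)  # torch.Size([4, 4])
```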
You can manually implement a custom function within the Python API by subclassing `torch.autograd.Function` (with a second derivative as well).
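As a sketch of what that can look like (the class name and derivative formula here are my own, not PyTorch's internal implementation): if `backward` is itself written with differentiable torch ops, autograd can build a graph through it under `create_graph=True`, which is what makes the second derivative available. Using `softplus` instead of a raw `exp()` also keeps the formula numerically stable:

```python
import torch
import torch.nn.functional as F

class Mish(torch.autograd.Function):
    """Hypothetical double-backward-capable Mish: x * tanh(softplus(x))."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x * torch.tanh(F.softplus(x))

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        t = torch.tanh(F.softplus(x))  # softplus avoids exp() overflow
        # d/dx [x * tanh(softplus(x))] = tanh(sp) + x * sech^2(sp) * sigmoid(x)
        return grad_out * (t + x * (1.0 - t * t) * torch.sigmoid(x))

x = torch.randn(3, dtype=torch.double, requires_grad=True)
y = Mish.apply(x).sum()
(g,) = torch.autograd.grad(y, x, create_graph=True)
(h,) = torch.autograd.grad(g.sum(), x)  # elementwise second derivative
print(h)
```

`torch.autograd.gradcheck` and `gradgradcheck` (in double precision) are a convenient way to verify both derivatives of such a function numerically.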
Yes, that’s what I did. But I’m wondering about the general philosophy regarding second derivatives of the various internal functions, since some of them are faster and more thoroughly tested than custom Python implementations.