I’m using the new autograd functional API and I have a question about differentiating a ReLU-activated network. Is it possible to compute the Hessian of the output of a neural network with respect to its inputs when the network uses ReLU activations? I imagine it would be if the output were passed through a sigmoid or softmax, but what if the output is just raw?
Hi,
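If the output is “raw” then your network as a whole is piecewise linear in its inputs, since it is just a composition of affine layers and ReLUs.
So the Hessian with respect to the inputs will be all 0s wherever the network is differentiable. The only exceptions are the points where some ReLU’s pre-activation is exactly 0: the network is non-differentiable there, so the Hessian is not well defined.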
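To check this yourself, here is a minimal sketch using `torch.autograd.functional.hessian`. The toy network, its layer sizes, and the helper names `raw_output` and `sigmoid_output` are made up for illustration and are not from your code. It compares the raw output with a sigmoid-activated one, since you mentioned that case.

```python
import torch
from torch.autograd.functional import hessian

torch.manual_seed(0)

# Hypothetical toy network: one ReLU hidden layer, raw (unactivated) scalar output.
net = torch.nn.Sequential(
    torch.nn.Linear(3, 8),
    torch.nn.ReLU(),
    torch.nn.Linear(8, 1),
)

def raw_output(x):
    # hessian() expects a function that returns a single-element tensor
    return net(x).squeeze()

def sigmoid_output(x):
    # Same network, but with a smooth nonlinearity applied to the output
    return torch.sigmoid(net(x)).squeeze()

x = torch.randn(3)  # a random point, which will not sit exactly on a ReLU kink

h_raw = hessian(raw_output, x)      # shape (3, 3), all zeros
h_sig = hessian(sigmoid_output, x)  # shape (3, 3), generally not all zeros

print(h_raw)
print(h_sig)
```

With the sigmoid on top, the only curvature comes from the sigmoid itself, so the Hessian is a scalar multiple of the outer product of the raw output's gradient with itself. The ReLU layers still contribute nothing to the second derivative.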