How is the gradient calculated?

Hi,

I was wondering how PyTorch calculates gradients, since I am interested in using my own loss function. I saw that torch.autograd.functional.jacobian computes the derivative of the values it gets, apparently using some formula.

I tried to go through the code on GitHub, but I can't find the part that works out which formula to use (e.g. if we use np.exp to compute the value, then the derivative is the same value, or if we use x**2, then the Jacobian will print 2x). How does it figure out which mathematical operation is being performed and how to compute its derivative?

Also, does it calculate the derivative of non-differentiable functions?

Please let me know.

Hi,

PyTorch uses automatic differentiation (AD) to compute all the gradients: each elementary operation you apply to a Tensor is recorded along with its known derivative rule, and the chain rule is applied through that recorded graph of operations. See here for more info about AD.
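To make that concrete, here is a minimal sketch with plain PyTorch (nothing custom) showing that gradients come from per-operation backward rules applied through the recorded graph, not from parsing a symbolic formula:

```python
import torch

# Each differentiable op (exp, pow, mul, ...) records itself in the autograd
# graph during the forward pass, together with its backward rule. Calling
# .backward() applies the chain rule through that recorded graph.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

y = (x ** 2).sum()   # forward pass builds the graph
y.backward()         # backward pass applies d(x**2)/dx = 2x
print(x.grad)        # tensor([2., 4., 6.])

# The same machinery is what torch.autograd.functional.jacobian uses:
def f(x):
    return x ** 2

jac = torch.autograd.functional.jacobian(f, torch.tensor([1.0, 2.0, 3.0]))
print(jac)           # 3x3 matrix with 2x on the diagonal
```

Note that only torch operations are recorded; something like np.exp is invisible to autograd, so your loss function needs to use torch.exp and friends for gradients to flow.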

Also, does it calculate the derivative of non-differentiable functions?

For functions that are non-differentiable only at isolated points (like relu at 0), we extend the gradient there with either a subgradient or by continuity.
For functions that are not differentiable more generally, an error will be raised if you pass them a Tensor that requires gradients.
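For illustration, here is a quick check of the single-point case with standard ops (relu and abs are both non-differentiable only at 0, and autograd returns 0 there):

```python
import torch

# relu and abs are differentiable everywhere except x = 0;
# at 0, autograd returns a valid subgradient (0) instead of failing.
x = torch.tensor(0.0, requires_grad=True)
torch.relu(x).backward()
print(x.grad)  # tensor(0.)

y = torch.tensor(0.0, requires_grad=True)
torch.abs(y).backward()
print(y.grad)  # tensor(0.)
```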