Computation of gradients (numerical computation or analytical derivation)

Hello,

I am new to PyTorch and have a fundamental question: how are the gradients computed during backpropagation? Are they computed from the analytical closed-form derivative of the function, or by numerical differentiation? For example, for the simple ReLU function, whose analytical derivative is trivial, are the gradients computed numerically?

The gradients are not computed numerically. They are computed by a method called “Automatic Differentiation”: every elementary operation (like ReLU) has its exact derivative defined in code, and during the backward pass the chain rule composes these derivatives through the recorded computation graph. The result is exact (up to floating-point precision), not a finite-difference approximation.

These lecture slides explain it well: https://www.cs.toronto.edu/~rgrosse/courses/csc321_2018/slides/lec10.pdf
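To make this concrete, here is a minimal sketch (the tensor values are just illustrative) showing that autograd returns exactly the analytical derivative of ReLU, which is 1 where x > 0 and 0 otherwise:

```python
import torch

x = torch.tensor([-2.0, -0.5, 0.5, 3.0], requires_grad=True)
y = torch.relu(x)

# y is not a scalar, so backpropagate a vector of ones;
# x.grad then holds dy_i/dx_i for each element.
y.backward(torch.ones_like(x))

# Analytical derivative of ReLU: 1 where x > 0, else 0.
analytical = (x > 0).float()

print(x.grad)      # tensor([0., 0., 1., 1.])
print(analytical)  # tensor([0., 0., 1., 1.])
assert torch.equal(x.grad, analytical)
```

As an aside, PyTorch does ship a numerical-differentiation utility, `torch.autograd.gradcheck`, but it is used to verify custom backward implementations against finite differences, not to compute gradients during training.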

Thanks for the answer!