Hi,
I’ve been reading a bit about graphs and autograd and came across the following:
PyTorch doesn’t compute the full Jacobian; instead it uses a VJP (vector-Jacobian product) to calculate the derivatives directly.
So I had a few questions in mind:
- What exactly does a VJP do? I understand it gives us the product of a vector and the Jacobian, but what does that represent?
- What is v? It’s the cotangent vector, but what role does it play? By default it’s a tensor of ones, which people keep describing as “differentiating with respect to itself”. But that’s just adding up the columns of the Jacobian, so how does adding up columns give you a derivative?
- If possible, can someone explain how the VJP is calculated directly from the graph?
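To make the question concrete, here is a small example of the behavior I’m asking about (assuming I’m calling autograd correctly); for an elementwise function the Jacobian is diagonal, so the VJP with a vector of ones happens to reproduce the elementwise derivatives:

```python
import torch

# Simple elementwise function: y = x**2, so the Jacobian is diag(2x)
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2

# v is the default cotangent: a tensor of ones, same shape as y
v = torch.ones_like(y)

# backward() never builds the Jacobian; it computes v^T @ J directly
y.backward(v)

print(x.grad)  # tensor([2., 4., 6.]) == v^T @ diag(2x)
```

In this diagonal case each column of the Jacobian has a single nonzero entry, so summing the columns against v = ones just reads off the per-element derivatives, which is why I don’t see what the column-sum means for a general (non-diagonal) Jacobian.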
I’ve been looking everywhere, and with each article I read it seems no one directly addresses this topic. Any help would be appreciated.
Thanks!