How does torch calculate the grads for scalar and non-scalar tensors?

I was reading the Optional Reading: Tensor Gradients and Jacobian Products section of this blog and it stated:

In many cases, we have a scalar loss function, and we need to compute the gradient with respect to some parameters. However, there are cases when the output function is an arbitrary tensor. In this case, PyTorch allows you to compute the so-called Jacobian product and not the actual gradient.

So, does it mean that the Jacobian product is calculated only for an arbitrary tensor, i.e. a non-scalar tensor, and gradients for scalar tensors are calculated in a different way?


The idea is that it always does a vector-Jacobian product. It just happens that when the output is a scalar, it is 0-dimensional, so the vector is of size 1 and can be replaced with just the value 1. That will give you the full Jacobian (and thus the gradients).
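As a small illustration of this (a sketch, not from the original post): for a scalar output, `backward()` implicitly uses the vector `1`, while for a non-scalar output you must supply the vector `v` for the vector-Jacobian product yourself.

```python
import torch

# Scalar output: backward() implicitly uses v = 1
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()               # scalar output
y.backward()                     # equivalent to y.backward(torch.tensor(1.0))
print(x.grad)                    # dy/dx = 2*x -> tensor([2., 4., 6.])

# Non-scalar output: pass v explicitly to compute v^T @ J
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2                       # shape (3,), non-scalar
v = torch.ones_like(y)           # choosing v = ones sums the rows of the Jacobian
y.backward(v)                    # vector-Jacobian product v^T @ J
print(x.grad)                    # tensor([2., 4., 6.])
```

With `v = torch.ones_like(y)`, both calls produce the same gradient, since summing the output first and then differentiating is the same as the vector-Jacobian product with a vector of ones.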


@albanD thanks, it makes sense now.