I noticed that when I leave `grad_outputs` as `None` in `autograd.grad`, I seem to get back the same gradients as when I set it to a sequence of ones (just 1 x 1 in my case). But when I compare the resulting gradient tensors with `==`, the result is mostly 0 with the occasional 1, even though the printed values look identical.
What does grad_outputs actually do in autograd.grad?
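A minimal sketch of what I'm doing (the tensor values here are just my own illustration):

```python
import torch

x = torch.tensor([[2.0]], requires_grad=True)   # a 1x1 tensor, as in my case
y = x ** 2                                      # 1x1 output

# grad_outputs=None: PyTorch fills in ones implicitly for single-element outputs
# (retain_graph=True so the graph survives for the second call)
g_none, = torch.autograd.grad(y, x, retain_graph=True)

# explicit ones in grad_outputs
g_ones, = torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y))

# `==` compares elementwise and returns a boolean tensor;
# torch.equal / torch.allclose compare whole tensors at once
print(g_none == g_ones)                  # prints tensor([[True]])
print(torch.allclose(g_none, g_ones))    # prints True
```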
@Krishna_Garg While this answer is by no means comprehensive, I have seen `grad_outputs` used when computing higher-order derivatives and vector products, e.g., Hessian-vector products.
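To make that concrete, here is a small sketch of my own (not from the docs): `grad_outputs` supplies the vector v in the vector-Jacobian product vᵀJ, which is why passing ones for a single-element output reproduces the default. Feeding a different v into a second `grad` call gives a Hessian-vector product:

```python
import torch

# f(x) = sum(x^2), so the gradient is 2x and the Hessian is 2I
x = torch.tensor([2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()

# First derivative; create_graph=True so we can differentiate it again
grad, = torch.autograd.grad(y, x, create_graph=True)

# Hessian-vector product: v enters through grad_outputs, producing H @ v.
# Since H = 2I here, the result is 2v.
v = torch.tensor([1.0, -1.0])
hvp, = torch.autograd.grad(grad, x, grad_outputs=v)
print(hvp)  # prints tensor([ 2., -2.])
```

So for a scalar (or single-element) output, `grad_outputs=None` and explicit ones are equivalent; the argument only becomes interesting when you want a weighting other than ones, as in the second call above.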