I am not sure whether you are asking what input.grad represents or for the full formula (which would require writing out the full derivation of the network with the chain rule).

input.grad gives you the gradients of output w.r.t. input (since you ran output.backward()).
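In case it helps, here is a minimal sketch of that. The small network, the shapes, and the seed are made up for illustration, since your actual model isn't shown; only input, output, and output.backward() follow the names used above. It checks input.grad against torch.autograd.grad, which computes the same gradient without writing to .grad.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy network (assumption: the real model is not shown in the thread).
model = torch.nn.Sequential(
    torch.nn.Linear(3, 4),
    torch.nn.Tanh(),
    torch.nn.Linear(4, 1),
)

# requires_grad=True tells autograd to track the input, so input.grad gets populated.
# Note: the name `input` shadows the Python builtin; it is kept to match the thread.
input = torch.randn(1, 3, requires_grad=True)

# Reduce to a scalar so backward() can be called without a gradient argument.
output = model(input).sum()
output.backward()

# The same gradient, computed directly and returned instead of stored in .grad.
(expected,) = torch.autograd.grad(model(input).sum(), input)

print(input.grad.shape)                      # same shape as input
print(torch.allclose(input.grad, expected))  # the two gradients should agree
```

Each entry of input.grad tells you how much output changes for a small change in the corresponding entry of input. Keep in mind that calling backward() a second time adds to input.grad rather than overwriting it, unless you reset it first.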

Maybe this tutorial could help you gain some intuition about what is happening behind the scenes.