SGD update does not use the loss value when calculating the update

I was playing around with gradients and optimizers, as I wanted to understand them from first principles. I created a toy example with a single convolutional layer to understand how the gradients are calculated and how the parameters are updated.

From what I have tried, it seems like the update uses only the gradient and not the loss value itself,
like so:

Say:
w_new = new parameter
w = old parameter
lr = learning rate
d_J_w = gradient of J w.r.t. w
J = loss

The operation that occurs looks like:
w_new = w - (lr * d_J_w)
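That update rule can be seen in action with a tiny standalone sketch (a hypothetical example, not the original convolutional-layer code): plain gradient descent on J(w) = (w - 3)^2 converges to the minimizer using only the gradient, never the loss value itself.

```python
# Minimal sketch (hypothetical toy problem): gradient descent on
# J(w) = (w - 3)^2, whose minimum is at w = 3.
def grad(w):
    # dJ/dw for J(w) = (w - 3)^2
    return 2.0 * (w - 3.0)

w = 0.0   # initial parameter
lr = 0.1  # learning rate

for _ in range(100):
    # The update uses only the gradient; J(w) itself never appears.
    w = w - lr * grad(w)

print(round(w, 4))  # converges toward 3.0
```

The loop never evaluates the loss, yet it still finds the minimum, because the gradient alone encodes which direction (and how steeply) the loss decreases.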

But shouldn't it actually be:
w_new = w - (lr * d_J_w * J)

Why does this happen?

Below is my code for reference:


Output:

I may be missing something very trivial, sorry, but I am pretty new to PyTorch.

Hello, just a suggestion, but it would be great if you could add more comments to your code; it'll make reading it faster.
So the learning process is generally like this:
New Weights = Old Weights - (Learning Rate * Gradient of Loss w.r.t. Weights)
So no, the loss value is not required in the above process; the gradient already encodes how the loss changes as each weight changes. Also note that the gradient d_J_w is of the loss J with respect to w, not the other way round, and J here denotes the loss itself, which appears in the update only through its gradient.