Why does the gradient become 10 times smaller?

When I tested automatic differentiation, the gradient didn't match what I calculated manually.

import numpy as np
import torch

data = torch.from_numpy(np.array([[0.5]], dtype=np.float32))
target = torch.from_numpy(np.array([[5.5], [7.5]], dtype=np.float32))
W = torch.tensor([[2.0], [4.0]], requires_grad=True)
out_put = data * W
print("out_put", out_put)
loss = target - out_put
print("loss", loss)
loss.mean().backward()
print("loss.mean", loss.mean())
print("W.grad", W.grad)
out_put tensor([[1.],
        [2.]], grad_fn=<MulBackward0>)
loss tensor([[4.5000],
        [5.5000]], grad_fn=<SubBackward0>)
loss.mean tensor(5., grad_fn=<MeanBackward0>)
W.grad tensor([[-0.2500],
        [-0.2500]])

I expected it to be -2.5, but got -0.25. Why does this happen?

Your objective is

J = (target - 0.5*w)/2  # division by 2 because of the mean
dJ/dw = d(target/2 - 0.25*w)/dw = -0.25

See, no problem: the factor of 0.5 from `data` and the factor of 1/2 from the mean multiply to give 0.25, and the minus sign comes from `target - out_put`.
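To double-check, here is a minimal sketch (reusing the values from the question) that computes the gradient by hand from the chain rule and compares it with what autograd produces:

```python
import numpy as np
import torch

# Same setup as in the question: out = data * W (broadcasted), J = mean(target - out)
data = torch.from_numpy(np.array([[0.5]], dtype=np.float32))
target = torch.from_numpy(np.array([[5.5], [7.5]], dtype=np.float32))
W = torch.tensor([[2.0], [4.0]], requires_grad=True)

loss = (target - data * W).mean()
loss.backward()

# Manual derivative: J = (1/2) * sum_i (target_i - 0.5 * w_i)
# dJ/dw_i = (1/2) * (-0.5) = -0.25 for each element of W
manual_grad = torch.full_like(W, -0.25)
print(W.grad)  # tensor([[-0.2500], [-0.2500]])
assert torch.allclose(W.grad, manual_grad)
```

Each element of `W` only affects one element of the loss, and the mean over the 2 elements contributes the 1/2; if you used `loss.sum()` instead of `loss.mean()`, the gradient would be -0.5 per element.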
