This is a crosspost from Stack Overflow.
I am trying to manually implement gradient descent in PyTorch as a learning exercise. I have the following to create my synthetic dataset:
import torch
torch.manual_seed(0)
N = 100
x = torch.rand(N, 1) * 5
# Let the following command be the true function
y = 2.3 + 5.1*x
# Get some noisy observations
y_obs = y + 2 * torch.randn(N, 1)
Then I create my predictive function (y_pred) as shown below.
w = torch.randn(1, requires_grad=True)
b = torch.randn(1, requires_grad=True)
y_pred = w * x + b
mse = torch.mean((y_pred - y_obs) ** 2)
which uses the mean squared error (MSE) to infer the weights w, b. I use the loop below to update the values according to the gradient.
gamma = 1e-2
for i in range(100):
    w = w - gamma * w.grad
    b = b - gamma * b.grad
    mse.backward()
However, the loop only works for the first iteration. From the second iteration onwards, w.grad is set to None. I am fairly sure this happens because I am setting w as a function of itself (I might be wrong).
The question is how do I update the weights properly with the gradient information?
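For reference, here is a sketch of what I believe the corrected loop should look like — recomputing the forward pass every iteration, updating the leaf tensors in place under torch.no_grad() so the updates stay out of the autograd graph, and zeroing the gradients so they do not accumulate. I am not certain this is the most idiomatic way, but it keeps w and b as leaf tensors:

```python
import torch

torch.manual_seed(0)
N = 100
x = torch.rand(N, 1) * 5
y_obs = 2.3 + 5.1 * x + 2 * torch.randn(N, 1)  # true function plus noise

w = torch.randn(1, requires_grad=True)
b = torch.randn(1, requires_grad=True)

gamma = 1e-2
losses = []  # track the loss to watch convergence
for i in range(100):
    # Recompute the forward pass each step; a graph is consumed by backward()
    y_pred = w * x + b
    mse = torch.mean((y_pred - y_obs) ** 2)
    losses.append(mse.item())

    mse.backward()  # populates w.grad and b.grad

    with torch.no_grad():        # keep the updates out of the graph
        w -= gamma * w.grad      # in-place update preserves the leaf tensor
        b -= gamma * b.grad
        w.grad.zero_()           # .grad accumulates, so reset it each step
        b.grad.zero_()
```

The in-place `-=` is what matters: writing `w = w - gamma * w.grad` rebinds the name w to a new, non-leaf tensor, which is why w.grad becomes None on the next iteration.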