This is a crosspost from stackoverflow.
I am trying to manually implement gradient descent in PyTorch as a learning exercise. I have the following to create my synthetic dataset:
import torch

torch.manual_seed(0)
N = 100
x = torch.rand(N, 1) * 5
# Let the following command be the true function
y = 2.3 + 5.1 * x
# Get some noisy observations
y_obs = y + 2 * torch.randn(N, 1)
Then I create my predictive function (y_pred) as shown below,
w = torch.randn(1, requires_grad=True)
b = torch.randn(1, requires_grad=True)

y_pred = w * x + b
mse = torch.mean((y_pred - y_obs) ** 2)
which uses MSE to infer the weights w, b. I use the block below to update the values according to the gradient.
gamma = 1e-2
for i in range(100):
    w = w - gamma * w.grad
    b = b - gamma * b.grad
    mse.backward()
However, the loop only works on the first iteration. From the second iteration onwards, w.grad is set to None. I am fairly sure this happens because I am setting w as a function of itself (but I might be wrong).
The question is: how do I update the weights properly with the gradient information?
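For reference, the closest working version I could put together is below. It is a sketch based on my understanding that the updates should happen inside torch.no_grad() (so the update step itself is not tracked by autograd and w, b remain leaf tensors), that the loss must be recomputed each iteration before calling backward(), and that the gradients need to be zeroed after each update since backward() accumulates them. I would still like to understand exactly why my original version fails:

```python
import torch

torch.manual_seed(0)
N = 100
x = torch.rand(N, 1) * 5
# True function plus noise, as in the setup above
y_obs = 2.3 + 5.1 * x + 2 * torch.randn(N, 1)

w = torch.randn(1, requires_grad=True)
b = torch.randn(1, requires_grad=True)

gamma = 1e-2
for i in range(100):
    # Rebuild the graph every iteration so backward() has fresh gradients
    y_pred = w * x + b
    mse = torch.mean((y_pred - y_obs) ** 2)
    mse.backward()
    with torch.no_grad():
        # In-place updates outside autograd keep w and b as leaf tensors
        w -= gamma * w.grad
        b -= gamma * b.grad
        # backward() accumulates gradients, so reset them each step
        w.grad.zero_()
        b.grad.zero_()

print(w.item(), b.item())
```

With this version, w.grad stays populated on every iteration and the loss steadily decreases.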