This is a crosspost from Stack Overflow.
I am trying to manually implement gradient descent in PyTorch as a learning exercise. I have the following to create my synthetic dataset:
import torch
torch.manual_seed(0)
N = 100
x = torch.rand(N, 1) * 5
# Let the following command be the true function
y = 2.3 + 5.1*x
# Get some noisy observations
y_obs = y + 2 * torch.randn(N, 1)
Then I create my predictive function (y_pred) as shown below.
w = torch.randn(1, requires_grad=True)
b = torch.randn(1, requires_grad=True)
y_pred = w * x + b
mse = torch.mean((y_pred - y_obs) ** 2)
which uses the mean squared error (MSE) to infer the weights w, b. I use the loop below to update the values according to the gradient.
gamma = 1e-2
for i in range(100):
    w = w - gamma * w.grad
    b = b - gamma * b.grad
    mse.backward()
However, the loop only works for the first iteration. From the second iteration onwards, w.grad is set to None. I am fairly sure this happens because I am setting w as a function of itself (I might be wrong).
The question is how do I update the weights properly with the gradient information?
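For reference, here is a sketch of what I believe the corrected loop should look like — recomputing the forward pass every iteration, updating the leaf tensors in place under torch.no_grad() so the updates stay out of the autograd graph, and zeroing the gradients so they do not accumulate. I am not certain this is the most idiomatic way, but it keeps w and b as leaf tensors:

```python
import torch

torch.manual_seed(0)
N = 100
x = torch.rand(N, 1) * 5
y_obs = 2.3 + 5.1 * x + 2 * torch.randn(N, 1)  # true function plus noise

w = torch.randn(1, requires_grad=True)
b = torch.randn(1, requires_grad=True)

gamma = 1e-2
losses = []  # track the loss to watch convergence
for i in range(100):
    # Recompute the forward pass each step; a graph is consumed by backward()
    y_pred = w * x + b
    mse = torch.mean((y_pred - y_obs) ** 2)
    losses.append(mse.item())

    mse.backward()  # populates w.grad and b.grad

    with torch.no_grad():        # keep the updates out of the graph
        w -= gamma * w.grad      # in-place update preserves the leaf tensor
        b -= gamma * b.grad
        w.grad.zero_()           # .grad accumulates, so reset it each step
        b.grad.zero_()
```

The in-place `-=` is what matters: writing `w = w - gamma * w.grad` rebinds the name w to a new, non-leaf tensor, which is why w.grad becomes None on the next iteration.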