Parameters not updating after optimizer.step()

Hi, I have read many posts on the discuss.pytorch.org community about parameters not getting updated, and found similar issues such as "Optimizer.step() not updating correctly", but I still couldn't figure out where my network goes wrong. Training is not happening in my network. Any inputs and suggestions, please?
I am working in a Jupyter Notebook in a Conda env.

from sklearn import datasets # to get regression data for model
import torch
from torch.nn import Linear, Sequential, Flatten, MSELoss
from torch.optim import SGD, Adam, LBFGS # stochastic optimizer and gradient descent optimizer
import numpy as np
import matplotlib.pyplot as plt
X = torch.FloatTensor(120, 1).uniform_(0, 2)
Y = 0.3*X.pow(3) - 0.2*X.pow(2) + 0.6*X.pow(1) + 0.9
degree = 3
p = torch.linspace(0, degree, degree + 1)
XX = X.pow(p)
model = Sequential(
    Linear(degree + 1, 1),
    Flatten(0, 1)
)
criterion = MSELoss()
optim = Adam(model.parameters(), lr=0.1)
no_of_epochs = 800  # number of iterations to run
for epoch in range(no_of_epochs):
    pred = model(XX).unsqueeze(1)
    loss = criterion(pred, Y)
    loss.backward
    optim.step()
    optim.zero_grad()
    if epoch % 300 == 0:
        print('epoch : ', epoch, 'loss :', loss.item())
print('polynomial_coefficients : ', list(model.parameters())[0].data.numpy())

Hi Surya!

As posted, your code – loss.backward – doesn’t actually call the
.backward() method of the loss Tensor.

Try loss.backward(). (Note the parentheses.)

Also, check that your gradients are not None and are non-zero
before calling optim.step(), e.g., print (model[0].weight.grad).
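
For reference, here is a minimal sketch of how the fixed loop might look with such a check added (reusing the model, optim, criterion, XX, Y and no_of_epochs names from your post; the one-off check at epoch 0 is just illustrative):

for epoch in range(no_of_epochs):
    pred = model(XX).unsqueeze(1)
    loss = criterion(pred, Y)
    loss.backward()          # note the parentheses: this call actually computes the gradients
    if epoch == 0:
        # sanity check: after backward() the gradient should be a tensor, not None
        print('weight grad:', model[0].weight.grad)
    optim.step()
    optim.zero_grad()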

Best.

K. Frank

Thanks @KFrank. That worked for me. The network is training now. I wonder why it didn't throw an error when I missed the parentheses at loss.backward.

Hi Surya!

This is a python thing that is not specific to pytorch.

At issue is that functions and function-like things are first-class objects
in python. The expression loss.backward evaluates to the backward
method of your loss tensor, but doesn’t call that method. Consider:

import torch
print (torch.__version__)

def twoX (x):
    return 2 * x

print ('twoX:', twoX)                 # a python function is a "first-class" object
print ('type (twoX):', type (twoX))
print ('twoX (3):', twoX (3))         # evaluates twoX (3)

func = twoX                           # python variables can refer to functions
print ('func:', func)
print ('func (3):', func (3))         # evaluates func (3), which is twoX (3)

# loss.backward is a method of Tensor
# it is a callable object, similar to a function

t = torch.tensor ([5.0], requires_grad = True)
loss = t.pow (2)
print ('t.grad:', t.grad)                               # t.grad starts out None
print ('loss.backward:', loss.backward)                 # does not call loss.backward
print ('type (loss.backward):', type (loss.backward))   # it's a method, rather than a function
print ('t.grad:', t.grad)                               # t.grad is still None
callable = loss.backward                                # assign loss.backward to a variable
print ('t.grad:', t.grad)                               # t.grad is still None
print ('callable():', callable())                       # call loss.backward() (returns None)
print ('t.grad:', t.grad)                               # now t.grad has been computed

And its output:

1.10.2
twoX: <function twoX at 0x00000279BB1797B8>
type (twoX): <class 'function'>
twoX (3): 6
func: <function twoX at 0x00000279BB1797B8>
func (3): 6
t.grad: None
loss.backward: <bound method Tensor.backward of tensor([25.], grad_fn=<PowBackward0>)>
type (loss.backward): <class 'method'>
t.grad: None
t.grad: None
callable(): None
t.grad: tensor([10.])

Best.

K. Frank

@KFrank Thanks, I understand it now. I appreciate the time you have put into this.