Hi, I have read many posts on the discuss.pytorch.org community about why parameters are not getting updated, and found similar issues such as "Optimizer.step() not updating correctly". But I can't figure out where my network goes wrong: no training is happening at all. Any inputs and suggestions, please?
I am working in a Jupyter Notebook in a Conda env.
from sklearn import datasets # to get regression data for model
import torch
from torch.nn import Linear, Sequential, Flatten, MSELoss
from torch.optim import SGD, Adam, LBFGS # stochastic optimizer and gradient descent optimizer
import numpy as np
import matplotlib.pyplot as plt
X = torch.FloatTensor(120, 1).uniform_(0, 2)
Y = 0.3*X.pow(3) - 0.2*X.pow(2) + 0.6*X.pow(1) + 0.9
degree = 3
p = torch.linspace(0, degree, degree + 1)
XX = X.pow(p)
model = Sequential(
    Linear(degree + 1, 1),
    Flatten(0, 1)
)
criterion = MSELoss()
optim = Adam(model.parameters(), lr=0.1)
no_of_epochs = 800 # number of training iterations to run
for epoch in range(no_of_epochs):
    pred = model(XX).unsqueeze(1)
    loss = criterion(pred, Y)
    loss.backward
    optim.step()
    optim.zero_grad()
    if epoch % 300 == 0:
        print('epoch : ', epoch, 'loss :', loss.item())
print('polynomial_coefficients : ', list(model.parameters())[0].data.numpy())
This is a python thing that is not specific to pytorch.
At issue is that functions and function-like things are first-class objects
in python. The expression loss.backward evaluates to the backward
method of your loss tensor, but doesn’t call that method. Consider:
import torch
print (torch.__version__)
def twoX (x):
return 2 * x
print ('twoX:', twoX) # a python function is a "first-class" object
print ('type (twoX):', type (twoX))
print ('twoX (3):', twoX (3)) # evaluates twoX (3)
func = twoX # python variables can refer to functions
print ('func:', func)
print ('func (3):', func (3)) # evaluates func (3), which is twoX (3)
# loss.backward is a method of Tensor
# it is a callable object, similar to a function
t = torch.tensor ([5.0], requires_grad = True)
loss = t.pow (2)
print ('t.grad:', t.grad) # t.grad starts out None
print ('loss.backward:', loss.backward) # does not call loss.backward
print ('type (loss.backward):', type (loss.backward)) # it's a method, rather than a function
print ('t.grad:', t.grad) # t.grad is still None
callable = loss.backward # assign loss.backward to a variable
print ('t.grad:', t.grad) # t.grad is still None
print ('callable():', callable()) # call loss.backward() (returns None)
print ('t.grad:', t.grad) # now t.grad has been computed
And its output:
1.10.2
twoX: <function twoX at 0x00000279BB1797B8>
type (twoX): <class 'function'>
twoX (3): 6
func: <function twoX at 0x00000279BB1797B8>
func (3): 6
t.grad: None
loss.backward: <bound method Tensor.backward of tensor([25.], grad_fn=<PowBackward0>)>
type (loss.backward): <class 'method'>
t.grad: None
t.grad: None
callable(): None
t.grad: tensor([10.])
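To connect this back to the original code: the fix is simply to call the method. Here is a minimal sketch of the corrected training loop, reusing the names from the post above; the only change is the parentheses on backward():
for epoch in range(no_of_epochs):
    pred = model(XX).unsqueeze(1)
    loss = criterion(pred, Y)
    loss.backward()    # the parentheses make this a call, so gradients get computed
    optim.step()       # now updates the parameters using those gradients
    optim.zero_grad()  # clear the gradients before the next iteration
    if epoch % 300 == 0:
        print('epoch : ', epoch, 'loss :', loss.item())
With that one change the optimizer receives real gradients, and the learned weights should move toward the true coefficients [0.9, 0.6, -0.2, 0.3] of the generating polynomial.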