# Loss is increasing

I have this simple nonlinear function that I want to fit to data, but the loss keeps increasing after each iteration. Any idea what I am doing wrong?

```python
import torch

# target parameters: w2 = 2, w1 = 3
w1 = torch.tensor([1.0], requires_grad=True)
w2 = torch.tensor([1.0], requires_grad=True)

b = 5

x_data = []
y_data = []

for i in range(20):
    i = float(i + 1)
    x_data.append(i)
    # y = w2*x^2 + w1*x
    y_data.append(i * i * 2 + i * 3)

# our model forward pass
def forward(x):
    return x * x * w2 + x * w1

# Loss function
def loss(y_pred, y_val):
    return (y_pred - y_val) ** 2

# Before training
print("Prediction (before training)", 4, forward(4).item())

# Training loop
for epoch in range(100):
    for x_val, y_val in zip(x_data, y_data):
        y_pred = forward(x_val)  # 1) Forward pass
        l = loss(y_pred, y_val)  # 2) Compute loss
        l.backward()             # 3) Back propagation
        w1.data = w1.data - 0.01 * w1.grad.item()
        w2.data = w2.data - 0.01 * w2.grad.item()

        # Manually zero the gradients after updating weights
        w1.grad.data.zero_()
        w2.grad.data.zero_()

    print(l.item())

# After training
print("Prediction (after training)", 4, forward(4).item())
```

result:
```
625.0
3969.0
4106.24658203125
10951.541015625
494030.0
146011072.0
176458432512.0
690122123116544.0
7.44382236290397e+18
1.967258245916323e+23
1.1615622546788777e+28
1.4223510151775662e+33
```

I think `.data` is deprecated, so the network should be rewritten to work without it.
so,

```python
w1.data = w1.data - 0.01 * w1.grad.item()
```

becomes

```python
w1 = w1 - 0.01 * w1.grad.item()
```

also, I think we should use an in-place update; otherwise a new `w1` is created and its gradient is `None`, so zeroing it out is an invalid operation. For example,

```python
w1 = w1 - 0.01 * w1.grad.item()
```

will give this error:

```
AttributeError: 'NoneType' object has no attribute 'zero_'
```

because we created a new `w1`, for which `grad` is `None`.
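A quick way to see this (a minimal repro I made up, not code from the original post):

```python
import torch

# the leaf tensor gets a gradient from backward()
w1 = torch.tensor([1.0], requires_grad=True)
l = (w1 * 2 - 10) ** 2
l.backward()
print(w1.grad)  # tensor([-32.])

# out-of-place update: builds a brand-new tensor that is NOT a leaf
w1 = w1 - 0.01 * w1.grad.item()
print(w1.grad)  # None -- so w1.grad.zero_() would raise AttributeError
```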

so, we use the in-place version,

```python
w1 -= 0.01 * w1.grad.item()
```

this way, a new `w1` is not created.
however, modifying `w1` in place like this gives another error,

```
RuntimeError: a leaf Variable that requires grad has been used in an in-place operation.
```

so we wrap the update in

```python
with torch.no_grad():
```

this ensures the update computations run with `requires_grad=False`, even though `w1` itself has `requires_grad=True`.
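Putting those pieces together, one update step would look roughly like this (a sketch; the initial values and the single data point are assumptions for illustration, not from the original post):

```python
import torch

w1 = torch.tensor([1.0], requires_grad=True)
w2 = torch.tensor([1.0], requires_grad=True)

x, y = 4.0, 44.0                     # target for x=4 is 2*4^2 + 3*4 = 44
l = (x * x * w2 + x * w1 - y).abs()  # forward pass + L1 loss
l.backward()

with torch.no_grad():                # exclude the update from the graph
    w1 -= 0.01 * w1.grad
    w2 -= 0.01 * w2.grad

# zero the gradients so they don't accumulate into the next iteration
w1.grad.zero_()
w2.grad.zero_()
print(w1.item(), w2.item())  # both nudged toward the targets
```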

also, the squared loss produces very large values, as listed in your post; if we use only the difference and take its absolute value, it starts working. For example,

```python
# Loss function
def loss(y_pred, y_val):
    return (y_pred - y_val).abs()
```
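For context, the divergence with the squared loss comes from its gradient scaling with the error; at the largest input the very first epoch already takes a huge step (rough numbers, assuming `w1 = w2 = 1` at that point):

```python
import torch

w1 = torch.tensor([1.0], requires_grad=True)
w2 = torch.tensor([1.0], requires_grad=True)

x, y = 20.0, 860.0                  # y = 2*20^2 + 3*20
l = (x * x * w2 + x * w1 - y) ** 2  # squared loss
l.backward()
print(w2.grad.item())  # 2 * (420 - 860) * 400 = -352000.0
```

A 0.01 step on that gradient moves `w2` by 3520 in one shot, which is why the loss explodes, while the absolute-value loss keeps the gradient bounded by x².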

when I make these changes and train for 1000 epochs, I get

```
Prediction (after training) 4 43.99794006347656
```

thank you for answering, but I don't see it converging:

here is the updated code:

```python
# Loss function
def loss(y_pred, y_val):
    return (y_pred - y_val).abs()

# Training loop
for epoch in range(1000):
    for x_val, y_val in zip(x_data, y_data):
        y_pred = forward(x_val)  # 1) Forward pass
        l = loss(y_pred, y_val)  # 2) Compute loss
        l.backward()             # 3) Back propagation
        with torch.no_grad():
            w1 -= 0.01 * w1.grad
            w2 -= 0.01 * w2.grad
        w1.grad.zero_()
        w2.grad.zero_()

    print(l.item())
```

my last loss values are still too high, and it is not converging:

```
27.027191162109375
319.6322021484375
58.488525390625
529.6104125976562
127.1507568359375
814.5077514648438
248.1339111328125
1190.4041748046875
Prediction (after training) 4 113.91207122802734
```

What am I doing wrong?

I also changed the learning rate to 0.0001.

Got it, it is converging now. I also had to use 50K epochs.
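For reference, here is the whole corrected loop condensed into one script (a sketch with the same data as above; far fewer epochs than the 50K used in the thread, so the fit is only partial):

```python
import torch

w1 = torch.tensor([1.0], requires_grad=True)
w2 = torch.tensor([1.0], requires_grad=True)

x_data = [float(i + 1) for i in range(20)]
y_data = [2 * x * x + 3 * x for x in x_data]  # targets: y = 2x^2 + 3x

def forward(x):
    return x * x * w2 + x * w1

for epoch in range(500):  # the thread needed ~50K epochs for full convergence
    for x_val, y_val in zip(x_data, y_data):
        l = (forward(x_val) - y_val).abs()  # L1 loss
        l.backward()
        with torch.no_grad():               # in-place updates, outside the graph
            w1 -= 0.0001 * w1.grad
            w2 -= 0.0001 * w2.grad
        w1.grad.zero_()                     # zero grads each iteration
        w2.grad.zero_()

print(forward(4).item())  # approaches 2*16 + 3*4 = 44
```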

thank you!