Simple Linear Regression with PyTorch

I am new to PyTorch and trying out its capabilities, so I am trying to train a simple linear regression on the popular Boston housing dataset.
This is my code:

# Note: load_boston was removed in scikit-learn 1.2, so this import needs an older version.
from sklearn.datasets import load_boston
import torch
import pandas as pd
import matplotlib.pyplot as plt
import torch.nn.functional as F
import torch.nn as nn
from torch.autograd import Variable
import numpy as np


boston = load_boston()

data = pd.DataFrame(boston['data'], columns=boston['feature_names'])

target = pd.Series(boston['target'])

data.shape, target.shape

# Variable is deprecated since PyTorch 0.4; plain tensors work directly.
dataA = torch.from_numpy(data.values).float()
y = torch.from_numpy(target.values).float()

linear = nn.Linear(data.shape[1], 1)

criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(linear.parameters(), lr=0.01)
loss2 = []


for i in range(5):
    optimizer.zero_grad()
    outputs = linear(dataA)
    
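    # outputs has shape (N, 1) but y has shape (N,); squeeze so MSELoss does not broadcast to (N, N)
    loss = criterion(outputs.squeeze(1), y)
    loss2.append(loss.item())  # loss.data[0] fails on 0-dim tensors in current PyTorch
    loss.backward()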

    optimizer.step()
    
plt.plot(range(5), loss2)
plt.show()

print(loss2)

The issue is that the loss keeps going up. Can anyone please look through the code and help me find out what should be done differently?

Decreasing the learning rate to 0.0000001 seems to work for me, as does switching to another optimizer like Adam. I’m not sure why that works, but I hope that helps.
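
To make the comparison concrete, here is a minimal sketch of the three settings mentioned above. It does not use the real Boston data: `X` and `y` are a synthetic stand-in (an assumption) with 13 non-negative features whose scales run from 1 to 100, and `train` is a hypothetical helper written only for this illustration.

```python
import torch
import torch.nn as nn

# Synthetic stand-in for the unnormalized housing data (an assumption, not the
# real dataset): 13 non-negative features whose scales range from 1 to 100.
gen = torch.Generator().manual_seed(0)
scales = torch.linspace(1.0, 100.0, 13)
X = torch.rand(506, 13, generator=gen) * scales
y = X.sum(dim=1, keepdim=True) + torch.randn(506, 1, generator=gen)


def train(make_optimizer, steps=5):
    """Run full-batch training and return the loss recorded before each step."""
    torch.manual_seed(0)  # same initial weights for every run
    model = nn.Linear(X.shape[1], 1)
    optimizer = make_optimizer(model.parameters())
    criterion = nn.MSELoss()
    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        loss = criterion(model(X), y)  # both tensors have shape (506, 1)
        losses.append(loss.item())
        loss.backward()
        optimizer.step()
    return losses


sgd_large = train(lambda params: torch.optim.SGD(params, lr=0.01))
sgd_small = train(lambda params: torch.optim.SGD(params, lr=1e-7))
adam = train(lambda params: torch.optim.Adam(params, lr=0.01))

for name, losses in [("SGD lr=0.01", sgd_large),
                     ("SGD lr=1e-7", sgd_small),
                     ("Adam lr=0.01", adam)]:
    print(f"{name:>12}: first={losses[0]:.4g}  last={losses[-1]:.4g}")
```

On this synthetic data, SGD with `lr=0.01` should show the loss growing, while the tiny learning rate and Adam should both show it shrinking. The exact numbers depend on the data, so treat them as illustrative only.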

Hi Richard, thanks, funny I tried to normalize the data and it works also!

Right, normalizing the data is important to prevent the gradients from exploding :smiley:
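
A minimal sketch of the normalization fix, assuming standardization (zero mean, unit variance per feature) is what is meant by "normalize". The data is again a synthetic stand-in rather than the real Boston set, and the learning rate is the original 0.01.

```python
import torch
import torch.nn as nn

# Synthetic stand-in for the raw features (an assumption, not the real dataset).
gen = torch.Generator().manual_seed(0)
scales = torch.linspace(1.0, 100.0, 13)
X = torch.rand(506, 13, generator=gen) * scales
y = X.sum(dim=1, keepdim=True) + torch.randn(506, 1, generator=gen)

# Standardize each feature column: subtract its mean, divide by its std.
X_norm = (X - X.mean(dim=0)) / X.std(dim=0)

torch.manual_seed(0)
model = nn.Linear(X_norm.shape[1], 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # same lr as the question

losses = []
for _ in range(100):
    optimizer.zero_grad()
    loss = criterion(model(X_norm), y)
    losses.append(loss.item())
    loss.backward()
    optimizer.step()

print(f"first loss: {losses[0]:.4g}, last loss: {losses[-1]:.4g}")
```

With real data, the mean and standard deviation would normally be computed on the training split only and reused for any validation or test data.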

But I still need to build the intuition for why a learning rate of 0.0000001 works too. I tried it, and yes, the loss even decreases toward 0.

I think the small learning rate works because the values in the data are large: large inputs produce large gradients, and a small step size keeps the updates from exploding. That’s also why normalizing the data helps.
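
One way to put numbers on this intuition: for full-batch gradient descent on a mean-squared-error linear model, the loss is quadratic in the weights, and plain gradient descent is stable only when the learning rate is below 2 / lambda_max, where lambda_max is the largest eigenvalue of the Hessian (2/N) AᵀA and A is the feature matrix with a column of ones for the bias. The sketch below computes that bound for raw and standardized features. The data is a synthetic stand-in (an assumption), and `max_stable_lr` is a hypothetical helper name.

```python
import numpy as np

# Synthetic stand-in for the raw features (an assumption, not the real dataset).
rng = np.random.default_rng(0)
scales = np.linspace(1.0, 100.0, 13)
X_raw = rng.random((506, 13)) * scales
X_std = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)


def max_stable_lr(X):
    """Return 2 / lambda_max, the stability bound for gradient descent on MSE."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])  # extra column for the bias
    hessian = 2.0 * A.T @ A / A.shape[0]
    lambda_max = np.linalg.eigvalsh(hessian)[-1]  # eigenvalues come back ascending
    return 2.0 / lambda_max


raw_limit = max_stable_lr(X_raw)
std_limit = max_stable_lr(X_std)
print(f"raw features:          lr must stay below about {raw_limit:.2e}")
print(f"standardized features: lr must stay below about {std_limit:.2e}")
```

For this synthetic data the raw bound falls well below 0.01 and above 0.0000001, while the standardized bound sits above 0.01, which matches what was observed in this thread. The bound applies to plain gradient descent; Adam rescales each update, so it does not follow the same rule.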

That’s true too, and logical. Thanks, Richard. This is worth checking out too: https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/

See my reply/code in the thread "PyTorch fails to (over)fit Boston housing dataset".

The code seems to work.