Simple Linear Regression with Pytorch

Babatunde_A · December 12, 2017, 3:12pm

Please, I am new to PyTorch and am trying out its capabilities, so I am trying to train a simple linear regression on the popular Boston dataset.
This is my code:

from sklearn.datasets import load_boston
import torch
import pandas as pd
import matplotlib.pyplot as plt
import torch.nn.functional as F
import torch.nn as nn
from torch.autograd import Variable
import numpy as np


boston = load_boston()

data = pd.DataFrame(boston['data'], columns=boston['feature_names'])

target = pd.Series(boston['target'])

data.shape, target.shape

dataA = Variable(torch.from_numpy(data.values).float())
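# Reshape the target to (N, 1) so it matches the model output; a (N,) target
# would be broadcast against the (N, 1) output inside MSELoss.
y = Variable(torch.from_numpy(target.values).float()).view(-1, 1)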

linear = nn.Linear(data.shape[1], 1)

criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(linear.parameters(), lr=0.01)
loss2 = []


for i in range(5):
    optimizer.zero_grad()
    outputs = linear(dataA)
    
    loss = criterion(outputs, y)
    loss2.append(loss.item())  # loss.data[0] no longer works on 0-dim tensors in recent PyTorch
    loss.backward()        

    optimizer.step()
    
plt.plot(range(5), loss2)
plt.show()

print(loss2)

The issue is that the loss keeps going up. Can anyone please look through the code and help find out what could be done differently?

richard · December 12, 2017, 4:05pm

Decreasing the learning rate to 0.0000001 seems to work for me, as does switching to another optimizer like Adam. I’m not sure why that works, but I hope that helps.
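
For anyone who wants to reproduce the comparison without the dataset, here is a minimal sketch. It assumes a synthetic stand-in for the Boston features (13 non-negative columns with very different ranges; the `scales` values and the `train` helper are made up for illustration) and runs the same five full-batch steps with SGD at lr=0.01, SGD at lr=1e-7, and Adam at lr=0.01.

```python
import numpy as np
import torch
import torch.nn as nn

# Synthetic stand-in for the Boston features (an assumption, not the real data):
# 13 non-negative columns whose ranges differ by orders of magnitude.
rng = np.random.default_rng(0)
scales = np.array([1, 100, 25, 1, 1, 8, 100, 10, 24, 700, 20, 400, 35], dtype=np.float64)
X_np = rng.uniform(0.0, 1.0, size=(506, 13)) * scales
y_np = 20.0 + X_np @ (rng.normal(size=13) / scales) + rng.normal(scale=0.5, size=506)

X = torch.from_numpy(X_np)               # float64 for numerical headroom
y = torch.from_numpy(y_np).view(-1, 1)   # (N, 1), same shape as the model output


def train(make_optimizer, steps=5):
    """Run full-batch training and return the loss recorded before each step."""
    torch.manual_seed(0)                 # same initial weights for every run
    model = nn.Linear(X.shape[1], 1).double()
    optimizer = make_optimizer(model.parameters())
    criterion = nn.MSELoss()
    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        loss = criterion(model(X), y)
        losses.append(loss.item())
        loss.backward()
        optimizer.step()
    return losses


sgd_large = train(lambda p: torch.optim.SGD(p, lr=0.01))
sgd_small = train(lambda p: torch.optim.SGD(p, lr=1e-7))
adam = train(lambda p: torch.optim.Adam(p, lr=0.01))

print("SGD  lr=0.01:", sgd_large)
print("SGD  lr=1e-7:", sgd_small)
print("Adam lr=0.01:", adam)
```

On data like this, the lr=0.01 SGD losses grow at every step while the lr=1e-7 losses shrink. Adam's per-parameter step is roughly bounded by its learning rate, so its losses stay finite even on the unscaled features.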

Babatunde_A · December 12, 2017, 4:10pm

Hi Richard, thanks. Funnily enough, I tried normalizing the data and that works as well!

richard · December 12, 2017, 4:11pm

Right, normalizing the data is important to prevent the gradients from exploding.
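
A minimal sketch of what normalizing could look like here, assuming plain per-column standardization (zero mean, unit variance) and a synthetic stand-in for the Boston features rather than the real data. The `scales` values are made up for illustration; the learning rate is the 0.01 from the original post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in for the Boston features (an assumption, not the real data):
# 13 non-negative columns with very different ranges.
scales = torch.tensor([1., 100., 25., 1., 1., 8., 100., 10., 24., 700., 20., 400., 35.])
X = torch.rand(506, 13) * scales
y = (20.0 + X @ (torch.randn(13) / scales) + 0.5 * torch.randn(506)).view(-1, 1)

# Standardize each column to zero mean and unit variance.
mean = X.mean(dim=0, keepdim=True)
std = X.std(dim=0, keepdim=True)
Z = (X - mean) / std

model = nn.Linear(Z.shape[1], 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # same lr as the original post

losses = []
for _ in range(5):
    optimizer.zero_grad()
    loss = criterion(model(Z), y)
    losses.append(loss.item())
    loss.backward()
    optimizer.step()

print(losses)
```

With standardized inputs the loss goes down from the first step at lr=0.01. When there is a separate test set, the usual practice is to reuse the `mean` and `std` computed on the training data.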

Babatunde_A · December 12, 2017, 4:16pm

But I still need to get the intuition for why a 0.0000001 learning rate works too. I tried it and yes, the loss even keeps decreasing toward 0.

richard · December 12, 2017, 4:17pm

I think the small learning rate works because the values in the data are large, so a small learning rate keeps the gradients from exploding. That’s also why normalizing the data helps.
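
One way to put numbers on that intuition: for full-batch gradient descent on an MSE loss, the loss is quadratic in the weights, and the updates only stay stable while the learning rate is below 2 / lambda_max, where lambda_max is the largest eigenvalue of the Hessian (2/n) AᵀA (A being the feature matrix with a column of ones for the bias). The sketch below computes that bound for raw and standardized features. It assumes a synthetic stand-in for the Boston features, and `max_stable_lr` is a hypothetical helper name.

```python
import numpy as np

# Synthetic stand-in for the Boston features (an assumption, not the real data).
rng = np.random.default_rng(0)
scales = np.array([1, 100, 25, 1, 1, 8, 100, 10, 24, 700, 20, 400, 35], dtype=float)
X = rng.uniform(0.0, 1.0, size=(506, 13)) * scales


def max_stable_lr(features):
    """Bound 2 / lambda_max for full-batch gradient descent on an MSE loss."""
    n = features.shape[0]
    A = np.hstack([features, np.ones((n, 1))])    # extra column for the bias
    hessian = (2.0 / n) * A.T @ A
    lambda_max = np.linalg.eigvalsh(hessian)[-1]  # eigenvalues come back ascending
    return 2.0 / lambda_max


# Standardize each column to zero mean and unit variance.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

raw_bound = max_stable_lr(X)
std_bound = max_stable_lr(Z)
print(f"raw features:          lr must stay below about {raw_bound:.2e}")
print(f"standardized features: lr must stay below about {std_bound:.2e}")
```

On features scaled like this, the raw bound lands far below 0.01 but above 0.0000001, while the standardized bound is well above 0.01. That matches what was observed in this thread: the tiny learning rate works on raw data, and 0.01 works once the data is normalized.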

Babatunde_A · December 12, 2017, 4:23pm

That’s true too, and logical. Thanks Richard. You can check this out too: https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/

talasinski · May 24, 2019, 4:09am

See my reply/code in the thread “PyTorch fails to (over)fit Boston housing dataset”.

The code seems to work.