Simple linear regression is somehow not working

I want to run a simple linear regression, but for some reason it's not working correctly.
Linear regression: y = weights * x + bias

import torch
from torch.autograd import Variable

# Training data x
x = Variable(torch.Tensor([[1, 2.0],
                           [1, 3.0],
                           [1, 7.0],
                           [1, 9.0]]))

#x = (x - x.mean()) / x.max()

# True labels y
y = Variable(torch.Tensor([2.0,3.0,7.0,9.0]))
#y = (y - y.mean()) / y.max()

# Weights
weights = Variable(torch.randn(2,1), requires_grad=True)

# Bias
bias = Variable(torch.randn(1), requires_grad=True)

optimizer = torch.optim.Adam([weights, bias], lr=0.00001)

# Actual training
loss_history = []
for i in range(10000):
    out = x.mm(weights).add_(bias)
    loss = torch.mean((out - y)**2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    
    loss_history.append(loss.data[0])

Final result:
loss = 0.2546786069869995
weights = (-1.2875, 0.3234) => should be: 0, 1
bias = -0.3301 => should be: 0

Why is the result so far off from the expected result?

Here is the solution:

out = out.view(-1)  # flatten (4, 1) to (4,) so it matches the shape of y

What was the mistake?
out is a (4, 1) tensor and y is a (4,) tensor.
Due to broadcasting, the subtraction out - y produced a (4, 4) matrix instead of a (4,) vector.
This made the loss, and all of the gradient computations that follow from it, incorrect.
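In context, the fix goes inside the training loop. A minimal sketch of the corrected forward pass, reusing the names from the code above:

out = x.mm(weights).add_(bias).view(-1)  # reshape (4, 1) -> (4,) to match y
loss = torch.mean((out - y)**2)          # mean over 4 element-wise errors, as intended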


Yes, that is the result of the broadcasting semantics.
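For reference, the broadcast is easy to see in isolation. A minimal sketch with standalone tensors (names chosen only for illustration):

import torch

a = torch.zeros(4, 1)  # same shape as out
b = torch.zeros(4)     # same shape as y

# b is treated as shape (1, 4); both operands then expand to (4, 4).
print((a - b).shape)            # torch.Size([4, 4])
print((a.view(-1) - b).shape)   # torch.Size([4])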
