Why does it not learn?

Hi,
Why is the following code not learning? The loss stays at 0.25 and the result looks random.
Changing the network topology, optimizer, etc. makes no difference.
Any help is appreciated. Thanks and regards, Oliver

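import torch
import torch.nn as nn
import numpy as np
# make_blobs is importable directly from sklearn.datasets;
# the samples_generator submodule was removed in newer scikit-learn releases
from sklearn.datasets import make_blobs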

x, y = make_blobs(n_samples=1000, centers=2, n_features=2)
x_data = torch.Tensor(x)
y_data = torch.Tensor(y)
x_data.requires_grad_(True)
y_data.requires_grad_(True)

model = torch.nn.Sequential(
    torch.nn.Linear(2, 10),
    torch.nn.Sigmoid(),
    torch.nn.Linear(10, 10),
    torch.nn.Sigmoid(),
    torch.nn.Linear(10, 1),
)

optimizer = torch.optim.Adam(model.parameters())

loss_fn = torch.nn.MSELoss()

for t in range(1000):
    y_pred = model(x_data)

    loss = loss_fn(y_data, y_pred)
    if t % 100 == 0:
        print(t, loss.item())

    model.zero_grad()
    loss.backward()

    optimizer.step()

Could you print the shapes of y_pred and y_data?
Both should have the same shape, i.e. [batch_size, 1] in your case.
We have seen similar issues in the past where dim1 was missing from the target tensor, which caused unwanted broadcasting inside the loss computation. Recent PyTorch versions raise a warning when this shape mismatch is detected.
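To illustrate why a missing dim1 can pin the loss at 0.25, here is a minimal sketch. It assumes balanced 0/1 targets and computes the squared error by hand rather than through MSELoss; the helper name broadcast_mse and the toy tensors are made up for this example. With shapes [n, 1] and [n], the difference broadcasts to [n, n], so every prediction is compared against every target. Under that comparison even a perfect prediction is penalized, and a constant output of 0.5 scores 0.25.

```python
import torch

# Balanced 0/1 targets with a missing dim1, i.e. shape [n] (toy data, not from the thread)
target = torch.tensor([0.0, 1.0] * 5)         # shape [10]
perfect_pred = target.unsqueeze(1)            # shape [10, 1], a "perfect" model output
constant_pred = torch.full((10, 1), 0.5)      # a model that always outputs 0.5


def broadcast_mse(pred, tgt):
    # [n, 1] - [n] broadcasts to [n, n]: every prediction vs. every target
    return ((pred - tgt) ** 2).mean()


diff_shape = (perfect_pred - target).shape
loss_perfect = broadcast_mse(perfect_pred, target).item()
loss_constant = broadcast_mse(constant_pred, target).item()

# With matching shapes [n, 1] and [n, 1] the perfect prediction scores zero
loss_matched = torch.nn.functional.mse_loss(perfect_pred, target.unsqueeze(1)).item()

print(diff_shape)      # shape of the broadcast difference
print(loss_perfect)    # perfect predictions under broadcasting
print(loss_constant)   # constant 0.5 under broadcasting
print(loss_matched)    # perfect predictions with matching shapes
```

In this toy setup the constant 0.5 output beats the perfect prediction under broadcasting, which would explain a model that settles at a loss of 0.25 and produces random-looking results.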

print(y_data.shape, y_pred.shape)
gives
torch.Size([10]) torch.Size([10, 1])

Add dim1 to your target and run your code again:

y_data = y_data.unsqueeze(1)
loss = loss_fn(y_pred, y_data)
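For reference, a self-contained sketch of the training loop with the fix applied might look like the following. Several things here are assumptions that are not in the original post: the two Gaussian clusters generated with torch stand in for make_blobs, the seed is fixed for reproducibility, and lr=0.01 is chosen only so that the short run converges quickly (the original uses the Adam defaults).

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed so the run is repeatable

# Stand-in for make_blobs: two Gaussian clusters around (-2, -2) and (2, 2)
n = 1000
labels = torch.randint(0, 2, (n,))
centers = torch.tensor([[-2.0, -2.0], [2.0, 2.0]])
x_data = centers[labels] + torch.randn(n, 2)

# The fix: give the target the same shape as the model output, [n, 1]
y_data = labels.float().unsqueeze(1)

model = torch.nn.Sequential(
    torch.nn.Linear(2, 10),
    torch.nn.Sigmoid(),
    torch.nn.Linear(10, 10),
    torch.nn.Sigmoid(),
    torch.nn.Linear(10, 1),
)

# assumption: lr=0.01 instead of the default, to converge within 1000 steps
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = torch.nn.MSELoss()

losses = []
for t in range(1000):
    y_pred = model(x_data)             # shape [n, 1]
    loss = loss_fn(y_pred, y_data)     # prediction first, target second
    losses.append(loss.item())
    if t % 100 == 0:
        print(t, loss.item())

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

first_loss, final_loss = losses[0], losses[-1]
```

Note that the inputs and targets do not need requires_grad_(True); only the model parameters are optimized.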

Thanks! I really appreciate the help