Simple regression problem - NN will not converge

endrew · April 27, 2023, 5:59pm

I am training a neural network to predict the frequency of a sine wave. The network learns to always predict the mean value of the labels (the frequencies in the training data). What am I doing wrong?

Please see the Google Colab code here:

Thanks!

eqy · April 28, 2023, 7:17am

The warning given:

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/loss.py:536: UserWarning: Using a target size (torch.Size([10])) that is different to the input size (torch.Size([10, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
  return F.mse_loss(input, target, reduction=self.reduction)

is important here. With yhat of shape [10, 1] and y_batch of shape [10], the difference yhat - y_batch is broadcast to shape [10, 10] before the mean is taken. The loss therefore measures the “mean squared error” between each prediction and all of the labels in the batch, not just the corresponding one. That quantity is minimized when every prediction equals the mean of the labels, which is the averaging effect you observed.
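To make the effect concrete, here is a minimal sketch with made-up values. The names yhat and y_batch follow the thread; the random data and the seed are assumptions for illustration, not taken from the Colab notebook. It checks that the mismatched call matches the mean over the full [10, 10] difference matrix, and that under this broadcast loss a constant prediction at the label mean scores better than perfect predictions.

```python
import warnings

import torch
import torch.nn.functional as F

torch.manual_seed(0)
yhat = torch.randn(10, 1)   # stand-in for the model output, shape [10, 1]
y_batch = torch.randn(10)   # stand-in for the labels, shape [10]

# Broadcasting pairs every prediction with every label.
diff = yhat - y_batch
print(diff.shape)  # torch.Size([10, 10])

# The mismatched call (warning silenced here) equals the mean over all 100 pairs.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    broadcast_loss = F.mse_loss(yhat, y_batch)
manual_loss = (diff ** 2).mean()

# With matching shapes, each prediction is compared only with its own label.
correct_loss = F.mse_loss(yhat.squeeze(-1), y_batch)

# Under the broadcast loss, predicting the label mean beats predicting
# every label exactly, so training is pulled toward the mean.
perfect = y_batch.unsqueeze(1)                          # predictions == labels
constant = torch.full((10, 1), y_batch.mean().item())   # always the mean
loss_perfect = ((perfect - y_batch) ** 2).mean()
loss_constant = ((constant - y_batch) ** 2).mean()
print(loss_perfect.item(), loss_constant.item())
```

The last two numbers show why the network settles on the mean: with the shape mismatch, a perfect predictor is penalized more than a constant one.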

By fixing the output shape with
yhat = yhat.squeeze()
I get

tensor(0.0034, grad_fn=<MseLossBackward0>)
tensor(0.0014, grad_fn=<MseLossBackward0>)
tensor(0.0003, grad_fn=<MseLossBackward0>)
tensor(0.0030, grad_fn=<MseLossBackward0>)
tensor(0.0097, grad_fn=<MseLossBackward0>)
tensor(0.0008, grad_fn=<MseLossBackward0>)
tensor(0.0006, grad_fn=<MseLossBackward0>)
tensor(0.0005, grad_fn=<MseLossBackward0>)
tensor(0.0010, grad_fn=<MseLossBackward0>)
tensor(0.0005, grad_fn=<MseLossBackward0>)
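
One small variation you might consider: squeeze() with no argument removes every dimension of size 1, so a final batch containing a single sample would collapse to a 0-d tensor and no longer match the target shape. Squeezing only the last dimension avoids that. A rough sketch, where the nn.Linear model and the input width of 4 are hypothetical stand-ins for the actual network:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 1)      # hypothetical stand-in for the real network
loss_fn = nn.MSELoss()

x_batch = torch.randn(10, 4)  # made-up inputs
y_batch = torch.randn(10)     # made-up labels, shape [10]

# Drop only the trailing size-1 dimension: [10, 1] -> [10]
yhat = model(x_batch).squeeze(-1)
assert yhat.shape == y_batch.shape  # cheap guard against silent broadcasting
loss = loss_fn(yhat, y_batch)

# With a batch of one sample the two variants differ:
x_last = torch.randn(1, 4)
print(model(x_last).squeeze().shape)    # torch.Size([])
print(model(x_last).squeeze(-1).shape)  # torch.Size([1])
```

Reshaping the labels instead, with y_batch.unsqueeze(-1), should work equally well; the point is only that both arguments to the loss have the same shape.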

endrew · April 28, 2023, 7:47am

That’s awesome! Thanks a lot!!!