I want to create a model that can give me real value in output.
My input is nx3 and output is in the range -40 to -140
that is my model
class Regressor(nn.Module):
def __init__(self):
super(Regressor, self).__init__()
self.fc1 = nn.Linear(input_size, hidden_size_1)
self.fc2 = nn.Linear(hidden_size_1, hidden_size_2)
self.fc3 = nn.Linear(hidden_size_2, hidden_size_3)
self.fc4 = nn.Linear(hidden_size_3, 1)
def forward(self, x):
x = F.relu(self.fc1(x))
x = F.relu(self.fc2(x))
x = F.relu(self.fc3(x))
x = self.fc4(x)
return x
I am using Adam Optimizer opt = optim.Adam(model3.parameters(), lr=0.001)
Training sequence
epoch_data = []
for epoch in range(10000):
avg_acc_test = 0
avg_acc_train = 0
avg_loss_train = 0
avg_loss_test = 0
for i in range(X_train.shape[0]):
data_x = X_train_tensor[i]
data_y = y_train_tensor[i]
pred = model3(data_x)
# print(pred[:5])
loss = mse_loss(pred, data_y)
opt.zero_grad()
loss.backward()
opt.step()
pred_test = model3(X_test_tensor[0])
loss_test = mse_loss(pred_test, y_test_tensor[0])
avg_loss_train += loss.data
avg_loss_test += loss_test.data
if epoch % 50 == 0:
print('loss test {} loss train {}'.format(avg_loss_test/X_test.shape[0], avg_loss_train/X_train.shape[0] ))
Now the problem is my loss is not converging it always get stuck around 176 and i tried many values of learning rate , different number of layers and different activation functions as well and different number of nodes as well, still it revolves around 176 , and yes i normalised the input data (not the output data)
You might try to normalize the output as well (and denormalize it for the validation/test case).
Also, did you manage to overfit your model on the training data?
I’m not sure which dataset you are using, but you could also try to scale down the model a bit and start with a single hidden layer for the beginning.
Yes i did tried with single hidden layer with only 5-10 neurons as well, but also then the loss was around 179 , and all of my predictions was almost same, while data is quite dispersed.
The data i am using is private.
But the values in the data are quite random.
How large is your dataset? Could you try to use a small sample (e.g. just 10 samples) and try to overfit it first?
If that doesn’t work at all, you could have a bug in your code somewhere or the model architecture might not be suitable for the data.
Hey, your advice to normalize the output data worked , now the predictions are somewhat sense full.
Just curious is it necessary to normalize the output data when working with regression ?
Not necessarily, but normalization might help, e.g. if your output value range is quite large.
Your model might have a hard time to push the parameters to higher values which would be necessary for a high range output.
Sorry if this sounds dumb, but if I did normalise my output values how will I predict the test data? will it be normalised as well? how to recover the actual values?
If you normalize the output during training (and the target of course, too), you might “denormalize” both to get the expected prediction ranges.
E.g. if you’ve standardized them by subtracting the mean and dividing with the stddev: