Possible reasons for a regression model producing almost the same prediction for every target

I have a model that predicts four numerical target values belonging to similar categories.

I use a CNN to extract features for each category's target value. Before the FC layers, I also append two extra values to the flattened feature maps.

This works in Keras. I rewrote my code in PyTorch, but the prediction results for all four categories are almost the same, with only a little fluctuation.

I worry that this is caused by the backpropagation.
The loss of my model is the RMSE of the four target values over a training batch.

Any ideas?

Hi,

This does not sound like something you shouldn't do.
Could you share some code? In particular, how do you define your model, and how do you add the extra values before the FC layers?

I concatenate the extra values with the feature map.

Here is my model output

		out = torch.cat((x1, x2, x3, x4), 1)
		# print('after_concat', out.shape)  # (batch_size, 4)
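
And here is a simplified sketch of how the two extra values are combined with a flattened feature map before the FC layers (layer sizes and names are placeholders, not my real model):

    import torch
    import torch.nn as nn

    class BranchWithExtras(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d((4, 4)),
            )
            # flattened feature map (16 * 4 * 4) plus the two extra values
            self.fc = nn.Sequential(
                nn.Linear(16 * 4 * 4 + 2, 64), nn.ReLU(),
                nn.Linear(64, 1),
            )

        def forward(self, x, extras):
            feat = self.conv(x).flatten(1)            # (batch_size, 256)
            feat = torch.cat((feat, extras), dim=1)   # append the two extra values
            return self.fc(feat)                      # (batch_size, 1) for one category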

Here is my loss function

def RMSELoss_new(yhat, y):
    # RMSE of each target column over the batch, averaged over the four targets
    return torch.mean(torch.sqrt(torch.mean((yhat - y) ** 2, 0)))

criterion = RMSELoss_new
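
Just to be explicit: this takes the RMSE of each target column over the batch and then averages over the four columns, which is not the same as the square root of the overall MSE. A quick check with made-up tensors:

    import torch
    import torch.nn as nn

    yhat = torch.randn(8, 4)   # made-up predictions: batch of 8, 4 targets
    y = torch.randn(8, 4)      # made-up labels

    per_column = RMSELoss_new(yhat, y)            # mean of the per-column RMSEs
    overall = torch.sqrt(nn.MSELoss()(yhat, y))   # sqrt of the MSE over all elements
    print(per_column.item(), overall.item())      # equal only if every column has the same MSE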

I use the SGD optimizer, and in the main training loop I compute the loss and update the model.

			optimizer.zero_grad()

			outputs = net(inputs, tr_angles)   # (batch_size,4)

			# loss = torch.sqrt(criterion(outputs, labels))
			loss = criterion(outputs, labels)
			# print(loss)  # tensor(107.8965, device='cuda:0', grad_fn=<SqrtBackward>)

			loss.backward()
			optimizer.step()

Here is my understanding:

for i in range(1, epoch + 1):
    model.train()
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()
    if val_print_show_time == i:
        with torch.no_grad():
            model.eval()
            val_outputs = model(val_inputs)
            # early stopping ......

I think I may have made a mistake: I should move the model.eval() outside of the with torch.no_grad(): block.

    if val_print_show_time == i:
        model.eval()
        with torch.no_grad():
            val_outputs = model(val_inputs)
            # early stopping ......
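
And for completeness, I plan to switch back to training mode after the validation step (a sketch; val_labels here is just the matching validation target tensor):

    if val_print_show_time == i:
        model.eval()                   # use running stats for batchnorm, disable dropout
        with torch.no_grad():          # no autograd bookkeeping needed for validation
            val_outputs = model(val_inputs)
            val_loss = criterion(val_outputs, val_labels)
        model.train()                  # back to training mode for the next epoch
        # early stopping ......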

But my model’s performance is not good.

For regression, especially for this kind of multi-value prediction problem, is there a better loss function or optimizer than RMSE and SGD (or Adam)?

All these look good.

  • Moving the .eval() outside of the no_grad() block won't change anything.
  • If you use batchnorm, you might want to double-check how the model behaves without model.eval(). The saved statistics used in eval mode might not be very good in the middle of training, when the model is still changing a lot.
  • You can check that the network initialization is what you expect; that could explain the difference with Keras (see the sketch below).
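
For example, to compare with Keras (which defaults to Glorot uniform weights and zero biases for Dense/Conv2D layers), you could print the parameter statistics or re-initialize the layers explicitly. A rough sketch, not tied to your actual model:

    import torch.nn as nn

    # inspect the current initialization
    for name, p in net.named_parameters():
        print(name, p.mean().item(), p.std().item())

    # or explicitly re-initialize conv/linear layers to match Keras' defaults
    def init_like_keras(m):
        if isinstance(m, (nn.Conv2d, nn.Linear)):
            nn.init.xavier_uniform_(m.weight)
            if m.bias is not None:
                nn.init.zeros_(m.bias)

    net.apply(init_like_keras)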

If we don't use a validation dataset to do the evaluation, I am not sure whether my model is learning or not.
In that case, how can we track the learning progress during training? Or is this kind of evaluation only for debugging?

Anyway, I will take your suggestions.

You can also do the evaluation with a large batch size, with the network in training mode. That might help you get a better idea while the weights are not yet stable.
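
Something like this (a sketch; val_inputs / val_labels are whatever validation tensors you already use):

    # keep the network in training mode so batchnorm uses the current (large) batch's statistics
    model.train()
    with torch.no_grad():                            # still no gradients needed for evaluation
        val_outputs = model(val_inputs)              # use a large validation batch here
        val_loss = criterion(val_outputs, val_labels)
    # note: this forward pass will still update batchnorm's running statistics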

You mean I can evaluate the model without calling model.eval()?
Is my understanding right? Thanks a lot!

Actually, I use a one-batch validation dataset :grinning: :grinning:

.eval() changes the behavior of some modules. For example, BatchNorm uses the saved running statistics instead of the current batch's statistics, dropout becomes an identity, etc.
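
A quick way to see the difference with dropout, for example:

    import torch
    import torch.nn as nn

    drop = nn.Dropout(p=0.5)
    x = torch.ones(1, 4)

    drop.train()
    print(drop(x))   # roughly half the entries zeroed, the rest scaled by 1 / (1 - p) = 2

    drop.eval()
    print(drop(x))   # identity: tensor([[1., 1., 1., 1.]])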
