I’m dealing with a dataset that consists of 100 images with target values in the range 0-10. When I train my ML model on this, I get proper convergence to an RMSE of 0.001. However, when I retrain the model and include a few data points orders of magnitude larger (1k-10k), I unsurprisingly get very large losses that start around an RMSE of 300+ and decay to, at best, around 5. Does anyone have any thoughts/approaches for dealing with regression models whose target data varies over orders of magnitude? Thanks!

I’m not sure if your new dataset contains only values in the larger range or just a few such samples.

In the former case, you could try to normalize the targets (e.g. with a z-score) and revert the normalization on the predictions for your validation and test sets.
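A minimal sketch of that idea with NumPy (the target values here are made up for illustration):

```python
import numpy as np

# hypothetical targets mixing small and large values
y_train = np.array([0.5, 2.0, 7.5, 1200.0, 9800.0])

# z-score normalization of the targets
mu, sigma = y_train.mean(), y_train.std()
y_norm = (y_train - mu) / sigma

# ... train the model on y_norm instead of y_train ...

# revert the normalization on the model outputs before computing
# validation/test metrics (y_norm stands in for predictions here)
y_pred = y_norm * sigma + mu
```

scikit-learn's `TransformedTargetRegressor` wraps this transform/inverse-transform pattern if you prefer not to do it by hand.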

In the latter case, this approach would also be possible, but I’m afraid the smaller values might be treated as noise and the model might overfit on the larger values.

The scenario is the latter case. I have already tried scaling the targets, but as you mentioned, the few large values dominate, and the smaller values, despite being the majority, are treated as noise. I'm not sure whether batch normalization would help here? Or possibly working in log space?