Constrain MLP to predict values between 0 and 1

How do I force my MLP model to predict values between 0 and 1? (I want to do this because my target is a continuous value between 0 and 1.)

NB: This is a regression problem and my loss is MSE

Hi @ays,

You could pass your output through an nn.Sigmoid layer, which forces the range to (0, 1).
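A minimal sketch of this approach (layer sizes here are placeholders, not from the original post):

```python
import torch
import torch.nn as nn

# Hypothetical MLP with a Sigmoid on the output: the final activation
# squashes the unbounded linear output into the open interval (0, 1).
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Linear(32, 1),
    nn.Sigmoid(),
)

x = torch.randn(4, 16)
pred = model(x)
print(pred.min().item() > 0.0 and pred.max().item() < 1.0)  # True
```

Since the loss is MSE, this output can be compared directly against targets in [0, 1].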

Hi Ays and Alpha!

An alternative approach that may or may not train more effectively with a
mean-squared-error loss would be to use the output of your model (which I
assume ranges from -inf to inf) unchanged – that is, not pass it through
a Sigmoid – but instead pass your target value through torch.logit(), which
transforms your target’s range from [0.0, 1.0] to [-inf, inf].

logit() is the inverse function of sigmoid(). So, in a sense, these two
schemes are comparing the same quantities, but doing so with a transformed
loss function.
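A minimal sketch of this scheme, with made-up tensors standing in for the model output and targets. Note that logit() is infinite at exactly 0 and 1, so its eps argument is used here to keep the transformed targets finite:

```python
import torch
import torch.nn.functional as F

raw_output = torch.randn(8)  # unbounded model output, no Sigmoid applied
target = torch.rand(8)       # targets in [0, 1]

# Transform the targets to [-inf, inf]; eps clamps inputs away from
# exactly 0 and 1 so the result stays finite.
logit_target = torch.logit(target, eps=1e-6)

# MSE is now computed in logit space.
loss = F.mse_loss(raw_output, logit_target)
```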

(Note that if your target value represents a probability, then treating this
as a regression problem with a mean-squared-error loss, while logically
consistent, will probably not train as well as treating it as a classification
problem with binary cross entropy as the loss.)


K. Frank


Thanks @KFrank and @AlphaBetaGamma96 for the reply. Isn’t it just for classification tasks that I’d pass the output through a sigmoid (or does it not matter, so I can use it in this case too)? Also, my target ranges between 0 and 1 (I’m trying to predict a Dice score, which ranges between 0 and 1), and there are no classes, so it’s a continuous value.

Probably a naive opinion: I wouldn’t try to enforce anything.

You clearly say it’s a regression task. This also means, I assume, that the ground-truth values in your training dataset are between 0 and 1. So if the model is trained well, and unseen data samples stem from a similar distribution as the training data, the predictions will be within 0 and 1.

In contrast, if a prediction does fall outside this range, this might mean that the unseen sample “looks” very different from the training data. In that case, forcing the prediction to be within 0 and 1 doesn’t feel right. Of course, in practice, you can always map negative values to 0 and values larger than 1 to 1. But enforcing the boundaries during training doesn’t seem like a good idea.
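The post-hoc mapping described above can be done in one call, clipping out-of-range predictions at inference time without constraining the model during training (the example values are made up):

```python
import torch

pred = torch.tensor([-0.3, 0.2, 0.7, 1.4])  # raw, unconstrained predictions

# Clip into [0, 1]: negatives become 0, values above 1 become 1,
# in-range values pass through unchanged.
clipped = torch.clamp(pred, min=0.0, max=1.0)
print(clipped)  # tensor([0.0000, 0.2000, 0.7000, 1.0000])
```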

After all, this is the common situation where a trained regressor is only valid within the range of the training data. For example, if you train a regressor predicting the resale price of houses based on #rooms, #floors, age, area_size, etc., you can predict the prices for hypothetical houses with 0 rooms or a million rooms. The regressor will give you some prediction, but it’s arguably not very meaningful, as such an unseen sample is so different from the training data. In short, while you can extrapolate with the regressor, the more you extrapolate, the less meaningful the prediction becomes.

Despite this, I’ve never seen approaches that enforce a lower and upper bound during training. That being said, you could try tree-based models (Decision Trees, Random Forests, Gradient-Boosted Trees, etc.), as those models cannot extrapolate (compared to, say, Linear Regression). So with tree-based models, your prediction will always be between the lowest and highest target value in your training data.
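A quick sketch of this property (assumes scikit-learn; the data is synthetic): a tree regressor predicts leaf averages of the training targets, so its output can never leave the range of those targets, even on inputs far outside the training distribution.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(100, 3))
y = rng.uniform(0.1, 0.9, size=100)  # training targets inside [0.1, 0.9]

tree = DecisionTreeRegressor().fit(X, y)

# Inputs far outside the training range still yield bounded predictions.
far_away = rng.uniform(10, 20, size=(5, 3))
pred = tree.predict(far_away)
print(y.min() <= pred.min() and pred.max() <= y.max())  # True
```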


That makes sense. Thanks!