Constrained Loss Function

Suppose I have a neural network model which outputs a single positive scalar value.

Every training instance in my data has two ordered input tensors (x1, x2). The outputs from the model are as follows:

y1 = model(x1)
y2 = model(x2)

How do I design a loss function that ensures y1 - y2 > 0? Additionally, I would like to achieve this without pushing the value of y2 towards 0. For my particular use case, we don't care how large (y1 - y2) is; we just need to make sure it is positive. What would be a suitable loss function to achieve this constraint on the outputs?

Hi Gdx!

First a comment:

Because neural networks are trained by “backpropagating” a
differentiable loss function using gradient-descent methods, it
is much more natural to use a differentiable penalty to penalize
violations of your desired constraint, rather than impose a hard
constraint. Note that such an approach does permit your constraint
to be violated, but if your penalty is large enough, your constraint
won’t be violated by very much.

Along these lines, consider using:

penalty_loss = torch.exp (alpha * (y2 - y1)).mean()

(This is a straightforward choice for the penalty function but many
other choices would also make sense.)

The greater you make alpha, the more heavily you will penalize
values for which y1 - y2 < 0 (and the less the network will care
about the value of y1 - y2, as long as it is greater than zero).
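To make this concrete, here is a minimal, self-contained sketch of the
penalty (the specific values of alpha, y1, and y2 are made up for
illustration). The penalty decays toward zero for pairs that already
satisfy y1 > y2 and grows rapidly for pairs that violate it, so the
gradient is dominated by the violating pairs:

```python
import torch

alpha = 5.0

# toy outputs for a batch of three (x1, x2) pairs; the middle
# pair violates the desired constraint y1 > y2
y1 = torch.tensor ([2.0, 1.0, 0.5], requires_grad = True)
y2 = torch.tensor ([1.0, 1.5, 0.4])

penalty_loss = torch.exp (alpha * (y2 - y1)).mean()
penalty_loss.backward()

# the gradient with respect to y1 is -alpha * exp (alpha * (y2 - y1)) / n,
# so the violating pair (1.0, 1.5) contributes by far the largest gradient
print (penalty_loss.item())
print (y1.grad)
```

Note that nothing in this penalty pulls y2 toward zero; it only depends
on the difference y2 - y1, so the network is free to place both outputs
wherever else the rest of its loss prefers.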

I might imagine starting training with alpha set to a moderate value,
and then increasing it as training progresses.
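One simple way to implement such a schedule (the linear ramp and the
specific start / end values here are just illustrative assumptions) is
a helper that maps the current epoch to an alpha value:

```python
# hypothetical linear ramp for alpha over the course of training;
# alpha_start, alpha_end, and num_epochs are illustrative choices
alpha_start = 1.0
alpha_end = 10.0
num_epochs = 100

def alpha_at (epoch):
    # linearly interpolate from alpha_start (epoch 0)
    # to alpha_end (final epoch)
    frac = epoch / (num_epochs - 1)
    return alpha_start + (alpha_end - alpha_start) * frac

# then, inside the training loop, something like:
#     alpha = alpha_at (epoch)
#     penalty_loss = torch.exp (alpha * (y2 - y1)).mean()
```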


K. Frank