# Maximize the loss of non-target classes

I am trying to design an ensemble loss function for an autoencoder that minimizes the reconstruction loss on the target class while maximizing it on the non-target class.

My current loss function works fine when I only minimize `target_loss`, but it starts giving `nan` values once I add the `non_target_loss` term to the overall loss.

```python
# Reconstruction error on the target class (to be minimized).
target_loss = MSE(input, output) + sparsity_term
# Reconstruction error on the non-target class (to be maximized).
non_target_loss = MSE(input, output_hat) + sparsity_term_hat

# I subtract because the second term has to be maximized.
loss = target_loss - coeff * non_target_loss
```

I am unsure about my design. Any suggestions for a better approach?

Hi Inferno!

This is to be expected. `mse_loss()` is bounded below (by zero), so
minimizing it won't cause anything to diverge; the most you can do
is drive it to zero.

However, because `mse_loss()` is unbounded above, your non-target
term, which is essentially `-mse_loss`, is unbounded below, so minimizing
the non-target term can and will diverge, hence the `nan`s.
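
Here's a tiny toy illustration of the mechanism (a made-up scalar example, not your model):

```python
import torch

# Toy setup: a single scalar parameter, optimized to *maximize* an MSE
# term by descending on its negative.
x = torch.zeros(1, requires_grad=True)
opt = torch.optim.SGD([x], lr=1.0)

for _ in range(200):
    opt.zero_grad()
    # -mse_loss is unbounded below, so this "loss" can decrease forever.
    loss = -torch.nn.functional.mse_loss(x + 1.0, torch.zeros(1))
    loss.backward()
    opt.step()

print(x)  # x has overflowed to inf; in a full model this blow-up surfaces as nan losses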

Assuming that your basic approach makes sense, you could consider
running your `non_target_loss` through something like a `sigmoid()`
before subtracting it from your total `loss`:

```python
loss = target_loss - coeff * torch.sigmoid(non_target_loss)
```

This total `loss` will still penalize your model for matching the non-target
class. `non_target_loss` can still become arbitrarily large, but, because
it is "softly" clipped by the `sigmoid()`, it won't ever cause your total
`loss` to become arbitrarily negative, and your training (almost certainly)
won't diverge.

Best.

K. Frank


@KFrank That's a nice way out. It works now. Thanks a lot.