Proper way to constrain the final linear layer to have non-negative weights?

I am trying to constrain the final layer of my NN to have non-negative weights for my binary classification task (the reason I want non-negative weights does not matter right now).

This is basically what my code looks like:

class Classifier(nn.Module):
    def __init__(self, in_dim, hidden_dim1, hidden_dim2, hidden_dim3, n_classes):
        super(Classifier, self).__init__()

        # other layers

        self.classify = nn.Linear(hidden_dim3, n_classes)

    def forward(self, g, h):
        # other layers

        hg = self.classify(h)

        hg = torch.sigmoid(hg)

        return hg

So am I doing this right? Is this the proper way of forcing the final layer to have only positive weights, so that it only looks for “positive” features to do classification?

Wouldn’t there be a problem because sigmoid with only positive inputs only outputs probabilities above 50%? The bias should fix this problem, right?

Note that keras has the NonNeg weight constraint (passed to a layer via kernel_constraint=keras.constraints.NonNeg()), which does the same thing, and I am trying to do that in pytorch.

Hi Richard!

It looks like this is related to your previous post:

My guess is that you’re not going about your problem in a sensible way.
But unless you describe your actual use case, it’s hard to know.

.data is deprecated, and the forum experts will threaten you with
the specter of computation-graph gremlins if you use it.

If you really want to do this, something like:

with torch.no_grad():
    self.classify.weight.copy_(self.classify.weight.clamp(min=0.0))

might be better.

Is this the key to what you are trying to do? What would it mean to
build a classifier that “only looks for ‘positive’ features?”

Well, if you only look for positive features, I suppose that it would
be natural to only find positive features, and therefore only output
probabilities greater than 50%.

But yes, this does seem problematic to me.

I doubt anything I say here will be relevant to your actual problem,
but here are some observations:

Just because your last layer has only positive weights (as distinct
from biases) doesn’t mean that your output values (even if you had
no biases) would be positive. Negative inputs to a positive-weight
layer can produce negative outputs.

But yes, a negative bias could flip the value of what would have been
a positive output to negative.

However, as it stands, I don’t see how your optimizer would know that
you want positive weights. So even if your biases could “fix” the
problem, your optimizer could well just take steps that leave your
biases alone, and make your weights negative (even though you
then force them by hand to be positive).
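
As a side note (not something raised in this thread, just a sketch of an alternative), one way to make the optimizer “aware” of the constraint is to reparametrize the weight so that the tensor being optimized is unconstrained, while the exposed weight is always non-negative. PyTorch 1.9+ provides torch.nn.utils.parametrize for exactly this; here softplus is an arbitrary choice of non-negative mapping:

```python
import torch
import torch.nn as nn
import torch.nn.utils.parametrize as parametrize

class NonNegative(nn.Module):
    """Maps an unconstrained tensor to a non-negative one."""
    def forward(self, w):
        return torch.nn.functional.softplus(w)

layer = nn.Linear(8, 1)
parametrize.register_parametrization(layer, "weight", NonNegative())

# The optimizer updates the hidden unconstrained tensor; the exposed
# layer.weight is always softplus(...) >= 0, with no post-step clamping.
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
loss = layer(torch.randn(4, 8)).pow(2).mean()
loss.backward()
opt.step()
assert (layer.weight >= 0).all()
```

With this approach the gradient step itself respects the constraint, so there is no weight-munging after the fact for momentum or Adam to be confused by.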

Good luck.

K. Frank


I’m trying to implement the idea of this paper for a more robust model :

so an “attacker” adding more benign features will not evade the model. The core idea is that by forcing the weights to be non-negative, adding benign features will not affect the output, because only features that drive the model toward the positive class can raise the score. Pretty simple, but effective.

They implemented this in keras using the NonNeg weight constraint (kernel_constraint=keras.constraints.NonNeg()).
So what is the best way of implementing this in a multi-layer NN in pytorch (for binary classification)? Meaning: how should I force the weights to be non-negative, and what activation function and optimization parameters should I use?

Hi Richard!

According to the keras documentation, Layer weight constraints:

“They are per-variable projection functions applied to the target
variable after each gradient update.”

So following along with what keras claims it does, you could try:

with torch.no_grad():
    self.classify.weight.copy_(self.classify.weight.clamp(min=0.0))

to force the constraint after each optimizer step.
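
Put together as a minimal, self-contained sketch (the layer sizes and data here are made up for illustration, and I use BCEWithLogitsLoss on raw logits rather than sigmoid + BCELoss), the clamp-after-each-step recipe looks like:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 1)          # stands in for the final classify layer
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(32, 4)
y = torch.randint(0, 2, (32, 1)).float()

for _ in range(10):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    # project the weights (but not the bias) back onto the constraint set
    with torch.no_grad():
        model.weight.clamp_(min=0.0)

assert (model.weight >= 0).all()   # constraint holds after every step
```

Note that only the weight is clamped; the bias is left free so that it can drift negative and recenter the logits, as discussed above.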

You would then hope that the training process causes the optimizer
to move the last layer’s bias into negative territory so that you get
predicted logits centered around zero, and therefore make “negative”
as well as “positive” predictions.

Such an approach would seem to be training the model with one hand
tied behind its back, but, in principle, it ought to be able to train the
biases to become negative.

Good luck.

K. Frank


Thank you for the answer. Any suggestion on which optimizer and loss function to choose? Does it really matter in this case? I am currently using:

    criterion = nn.BCELoss()
    optimizer = optim.Adam(model.parameters(), lr=0.001)

Hi Richard!

From your previous posts it appears that you have a multi-label,
multi-class problem. For this, BCELoss is reasonable. For
improved numerical stability, however, you should prefer using
BCEWithLogitsLoss and remove the sigmoid() call from your
forward() function.
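
To be concrete about that swap: BCEWithLogitsLoss applied to raw logits computes the same value as BCELoss applied to sigmoid(logits), but folds the sigmoid into the loss in a numerically safer way. A quick check:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(8, 3)                        # raw model outputs
targets = torch.randint(0, 2, (8, 3)).float()

loss_a = nn.BCELoss()(torch.sigmoid(logits), targets)
loss_b = nn.BCEWithLogitsLoss()(logits, targets)  # no explicit sigmoid
assert torch.allclose(loss_a, loss_b, atol=1e-6)
```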

Would it be practical for you to experiment with different optimizers,
or is your model too slow and expensive to play around with?

I would recommend starting with plain-vanilla SGD, if only to get a
baseline for comparison.

Both SGD with momentum and Adam “remember” things from
previous steps, so the fact that you will be munging your weights
after the optimization step might confuse this process. It’s not that
you shouldn’t try SGD with momentum or Adam – they might work
fine – but be on the lookout for potential issues and use plain-vanilla
SGD as a sanity check.

In general, weight decay is worthwhile, so you should probably turn
it on (but run a baseline without it).

Your BCEWithLogitsLoss (or BCELoss) loss function should cause
your model to train to make some “negative” predictions (assuming
that your training data is sensible), but your weight-munging scheme
is, at best, going to make your training process more difficult. So you
might have to carry out training runs that are longer than normal.
(Also, experiment with your learning rate.)

Lastly, if your training data is “unbalanced,” that is, some given class
has many more “negative” samples than “positive,” you should also
consider using BCEWithLogitsLoss's pos_weight argument to
account for this.
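
As a sketch of how pos_weight enters (the 10-to-1 imbalance here is a made-up example): it multiplies the loss term for positive targets, so a class with ten times more negatives than positives might get pos_weight = 10:

```python
import torch
import torch.nn as nn

# Hypothetical imbalance: 10 negatives per positive for this class
pos_weight = torch.tensor([10.0])
weighted = nn.BCEWithLogitsLoss(pos_weight=pos_weight)
plain = nn.BCEWithLogitsLoss()

logit = torch.zeros(1, 1)
positive = torch.ones(1, 1)

# For a positive target, the weighted loss is pos_weight times the plain loss
assert torch.allclose(weighted(logit, positive), 10 * plain(logit, positive))
```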

Good luck.

K. Frank