I currently have a trained system with a softmax as the activation function, which returns a vector of probabilities for each class, all together summing to 1 (example: [0.6, 0.1, 0.15, 0.05, 0.1]).
I was using this model to perform multi-class classification, but I'd like to try it with a multi-label approach, and I was wondering if there exists any activation function that returns the probability of each class being correct independently of the others (not necessarily summing to 1; example: [0.8, 0.2, 0.1, 0.7, 0.15]).
sigmoid() should do what you want.
A couple of points:
I imagine that you have a final
Linear layer feeding your activation function. The
Linear layer produces a set of raw-score logits. softmax() converts
these to a set of class probabilities that sum to one, while sigmoid() converts
each logit individually to the (binary) probability of the corresponding
class being "active" or "inactive."
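To make the contrast concrete, here is a small sketch (the logit values are made up) comparing the two activations applied to the same raw scores:

```python
import torch

# Hypothetical raw-score logits from a final Linear layer (5 classes).
logits = torch.tensor([2.0, -1.0, 0.5, 1.5, -0.5])

# softmax(): mutually exclusive class probabilities that sum to one.
probs_softmax = torch.softmax(logits, dim=0)

# sigmoid(): an independent "active" probability per class; no sum constraint.
probs_sigmoid = torch.sigmoid(logits)
```

The softmax output sums to exactly 1, while the sigmoid output is just five independent values between 0 and 1.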
No (built-in) loss function can be used directly with
softmax()'s output. For the single-label, multi-class case, you should either feed the logits to
CrossEntropyLoss, or use
log_softmax() as the activation function
and pass its output to the
NLLLoss loss function.
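The two single-label options above give identical losses; a minimal sketch with made-up shapes:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(4, 5)            # batch of 4 samples, 5 classes (made up)
targets = torch.tensor([1, 0, 3, 2])  # one class index per sample

# Option 1: feed raw logits straight to CrossEntropyLoss.
loss_ce = nn.CrossEntropyLoss()(logits, targets)

# Option 2: log_softmax() activation followed by NLLLoss.
loss_nll = nn.NLLLoss()(torch.log_softmax(logits, dim=1), targets)
```

CrossEntropyLoss is exactly log_softmax() plus NLLLoss fused into one call, so the two losses match.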
For the multi-label, multi-class case, you can feed sigmoid()'s output to
BCELoss, but you will be better off, for numerical reasons, feeding the raw logits directly to BCEWithLogitsLoss.
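A sketch of the multi-label case (shapes and targets are made up): both paths compute the same loss on well-behaved logits, but BCEWithLogitsLoss uses the log-sum-exp trick internally and stays stable for large-magnitude logits.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(4, 5)                     # 4 samples, 5 independent labels
targets = torch.randint(0, 2, (4, 5)).float()  # multi-hot targets

# Numerically preferred: raw logits into BCEWithLogitsLoss.
loss_stable = nn.BCEWithLogitsLoss()(logits, targets)

# Mathematically equivalent but less stable: sigmoid() then BCELoss.
loss_naive = nn.BCELoss()(torch.sigmoid(logits), targets)
```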
I wouldn't just take your trained single-label model and replace its
softmax() with sigmoid() to make multi-label predictions. You
can, however, start with your trained single-label model and use it as a
pre-trained starting point for further training of your multi-label model.
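That fine-tuning step might look like the sketch below. The architecture, file name, and shapes are all hypothetical; the point is that the model should end at the Linear layer (no softmax), with BCEWithLogitsLoss applied to its raw logits during the new training run:

```python
import torch
import torch.nn as nn

# Hypothetical single-label model reused as a pre-trained starting point.
# Note: it ends at the Linear layer -- no softmax() baked into the model.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 5))
# model.load_state_dict(torch.load("single_label_model.pt"))  # hypothetical checkpoint

criterion = nn.BCEWithLogitsLoss()  # multi-label loss on raw logits
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

x = torch.randn(8, 10)                   # made-up batch of inputs
y = torch.randint(0, 2, (8, 5)).float()  # multi-hot multi-label targets

# One fine-tuning step.
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()

# At inference time, apply sigmoid() to get per-class probabilities.
probs = torch.sigmoid(model(x))
```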