I have a model that does multilabel classification. After computing the outputs in training mode, I iterate over thresholds and choose the one that gives me the best fbeta_score.
My question is: should I calculate the loss on those thresholded predictions, or on the original predictions?
So, is the manual threshold a post-processing step to use only when the model is in eval mode?
Thank you for your help
We generally apply a threshold only at inference. I think computing the loss on the original predictions (the ones your model outputs) is the recommended method.
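To make the split concrete, here is a minimal sketch, assuming sigmoid probabilities and a binary cross-entropy loss (the thread does not say which loss is used). The arrays and the bce_loss helper are made up for illustration and NumPy stands in for the actual framework. The loss is computed on the raw probabilities; the threshold only appears when hard 0/1 predictions are needed.

```python
import numpy as np

def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all samples and labels (illustrative helper)."""
    probs = np.clip(probs, eps, 1 - eps)  # avoid log(0)
    return float(-np.mean(targets * np.log(probs)
                          + (1 - targets) * np.log(1 - probs)))

# Hypothetical data: 3 samples, 2 labels each.
targets = np.array([[1, 0], [0, 1], [1, 1]], dtype=float)
probs = np.array([[0.9, 0.2], [0.4, 0.7], [0.6, 0.3]])  # raw model outputs

# Training: the loss sees the raw probabilities, never the thresholded values.
loss = bce_loss(probs, targets)

# Evaluation / inference: threshold only to obtain hard predictions.
threshold = 0.5  # assumed value; it can be tuned separately
preds = (probs >= threshold).astype(int)
```

One reason to keep it this way: thresholding is a step function, so a loss computed on the 0/1 predictions would give the optimizer no useful gradient.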
The problem is that when I use fbeta_score, I get an error:
"ValueError: Classification metrics can’t handle a mix of multilabel-indicator and continuous-multioutput targets"
I take it to mean that I need to binarize the predictions (so going from probabilities to 0/1).
To do that I need to decide on a threshold. So, if I’m going to choose one why not choose the one that gives me the best fbeta_score?
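A threshold sweep like that might look like the sketch below. It assumes scikit-learn's fbeta_score with micro averaging and beta=2, neither of which is stated in the thread, and the best_threshold helper and the sample arrays are hypothetical. Binarizing before the call is what avoids the ValueError quoted above, because the metric expects 0/1 indicator arrays for both arguments.

```python
import numpy as np
from sklearn.metrics import fbeta_score

def best_threshold(y_true, y_prob, beta=2.0, thresholds=None):
    """Return (threshold, score) with the highest F-beta (illustrative helper)."""
    if thresholds is None:
        thresholds = np.linspace(0.05, 0.95, 19)  # assumed search grid
    best_t, best_score = None, -1.0
    for t in thresholds:
        y_pred = (y_prob >= t).astype(int)  # probabilities -> 0/1 indicators
        score = fbeta_score(y_true, y_pred, beta=beta,
                            average="micro", zero_division=0)
        if score > best_score:
            best_t, best_score = float(t), float(score)
    return best_t, best_score

# Hypothetical data: 3 samples, 2 labels each.
y_true = np.array([[1, 0], [0, 1], [1, 1]])
y_prob = np.array([[0.9, 0.2], [0.4, 0.7], [0.6, 0.3]])

t, score = best_threshold(y_true, y_prob)
```

Choosing the threshold this way is fine as a post-processing step. It is probably safer to run the sweep on held-out validation data rather than on the training batch, so the chosen value is not fitted to the same samples the model was trained on. The loss itself still uses the raw probabilities.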