Calculating Precision, Recall and F1 Score for Multi-Label Classification

@ptrblck
I am also working on a multi-label classification task where my ground truth labels are one-hot encoded. I get predicted values for each sample and the loss is computed properly. But when I try to compute accuracy as you suggested in the post, I still get the error "ValueError: Classification metrics can't handle a mix of unknown and multilabel-indicator targets". This is the call that raises it:

print('F1: {}'.format(f1_score(labels.data.to('cpu'), outputs.data.to('cpu') > 0.5, average="samples")))
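The shapes and dtypes of `labels` and `outputs` are not shown, so the cause can only be guessed. scikit-learn reports a target as "unknown" when it cannot recognise its type, for example an array with extra dimensions or an object it cannot read as a plain numeric array. A minimal sketch, assuming both inputs should end up as 2-D integer arrays of shape (n_samples, n_labels): the helpers `to_indicator` and `samples_prf` are hypothetical names introduced here, and the small arrays are made-up illustration data. For PyTorch tensors, `tensor.detach().cpu().numpy()` would be passed in instead of the lists. The second helper computes the sample-averaged scores by hand with NumPy, which is useful as a cross-check.

```python
import numpy as np


def to_indicator(values, threshold=None):
    """Return a 2-D int array of shape (n_samples, n_labels).

    Hypothetical helper. For a PyTorch tensor, pass
    `tensor.detach().cpu().numpy()` as `values`.
    """
    arr = np.asarray(values)
    if threshold is not None:
        arr = arr > threshold  # scores -> boolean predictions
    arr = arr.astype(int)
    if arr.ndim != 2:
        raise ValueError(f"expected (n_samples, n_labels), got shape {arr.shape}")
    return arr


def samples_prf(y_true, y_pred):
    """Sample-averaged precision, recall and F1 for 0/1 indicator arrays."""
    tp = (y_true & y_pred).sum(axis=1).astype(float)  # true positives per sample
    n_pred = y_pred.sum(axis=1)  # predicted positives per sample
    n_true = y_true.sum(axis=1)  # actual positives per sample
    denom = n_pred + n_true
    # A sample with an empty denominator scores 0 instead of dividing by zero.
    precision = np.divide(tp, n_pred, out=np.zeros_like(tp), where=n_pred > 0)
    recall = np.divide(tp, n_true, out=np.zeros_like(tp), where=n_true > 0)
    f1 = np.divide(2 * tp, denom, out=np.zeros_like(tp), where=denom > 0)
    return precision.mean(), recall.mean(), f1.mean()


# Made-up data: 2 samples, 3 labels.
labels = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
scores = [[0.9, 0.2, 0.4], [0.1, 0.8, 0.7]]

y_true = to_indicator(labels)
y_pred = to_indicator(scores, threshold=0.5)

precision, recall, f1 = samples_prf(y_true, y_pred)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")  # P=0.750 R=0.750 F1=0.667
```

If scikit-learn is preferred, the same `y_true` and `y_pred` arrays can be passed to `f1_score(y_true, y_pred, average="samples")`, and the result should agree with the manual calculation. If `to_indicator` raises on the real data, the reported shape is a good hint at what scikit-learn could not interpret.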