Calculating Precision, Recall and F1 score for multi-label classification

I am using scikit-learn metrics for this with the following code:

from sklearn.metrics import f1_score, precision_score, recall_score

print('F1: {}'.format(f1_score(outGT, outPRED, average="samples")))
print('Precision: {}'.format(precision_score(outGT, outPRED, average="samples")))
print('Recall: {}'.format(recall_score(outGT, outPRED, average="samples")))

The above code throws the following error:

ValueError: Classification metrics can't handle a mix of multilabel-indicator and continuous-multioutput targets
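
As far as I understand, these scikit-learn metrics expect both arguments to be binary indicator arrays of shape (n_samples, n_labels). A minimal toy sketch (the arrays are made up, not my data) of inputs the functions do accept:

import numpy as np
from sklearn.metrics import f1_score

# Both arguments are 0/1 indicator matrices: rows are samples, columns are labels
y_true = np.array([[1, 0, 1],
                   [0, 1, 0]])
y_pred = np.array([[1, 0, 0],
                   [0, 1, 1]])

# average="samples" computes the score per sample and then averages over samples
print('F1: {}'.format(f1_score(y_true, y_pred, average="samples")))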

The output of

print('Ground Truth: {}'.format(outGT))
print('Predicted Truth: {}'.format(outPRED))

is shown below:

Ground Truth: 
    0     0     0  ...      0     0     0
    0     0     0  ...      0     0     0
    0     0     0  ...      0     0     0
       ...          ⋱          ...       
    1     0     0  ...      0     0     0
    1     0     0  ...      0     0     0
    0     0     0  ...      0     0     0
[torch.cuda.FloatTensor of size 22433x14 (GPU 0)]

Predicted Truth: 
 0.0901  0.0916  0.0389  ...   0.0021  0.0078  0.0016
 0.0424  0.0084  0.0111  ...   0.0053  0.0079  0.0025
 0.0611  0.0205  0.0206  ...   0.0024  0.0074  0.0018
          ...             ⋱             ...          
 0.3588  0.0223  0.1421  ...   0.0036  0.0094  0.0035
 0.1782  0.0226  0.2275  ...   0.0033  0.0129  0.0016
 0.2574  0.0176  0.2255  ...   0.0034  0.0118  0.0023
[torch.cuda.FloatTensor of size 22433x14 (GPU 0)]
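
So the ground truth is already a binary indicator matrix, while the predictions are continuous probabilities, which seems to be exactly the mix the error is complaining about. Is something like the sketch below the right way to bring both into a form these metrics accept? Here I move the tensors to the CPU, convert them to numpy arrays, and binarize the probabilities at 0.5 (the 0.5 threshold is just my assumption):

from sklearn.metrics import f1_score, precision_score, recall_score

# Move both GPU tensors to the CPU and convert to numpy arrays
gt = outGT.cpu().numpy().astype(int)   # values are already 0/1, cast to int for safety
probs = outPRED.cpu().numpy()

# Binarize the continuous predictions; 0.5 is an assumed threshold
preds = (probs >= 0.5).astype(int)

print('F1: {}'.format(f1_score(gt, preds, average="samples")))
print('Precision: {}'.format(precision_score(gt, preds, average="samples")))
print('Recall: {}'.format(recall_score(gt, preds, average="samples")))

Or is there a recommended way to choose the threshold / handle multi-label probabilities with these metrics?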
