How to make cost function out of confusion matrix?

sonnguyen · February 25, 2019, 2:51am

Hi all,
I built a model which returns

y = net(x)
which is a binary tensor, and the correct target is a tensor t.
I want to train my model using a cost function formed from the confusion matrix.
How can I build such a cost function from confusion matrix values such as TP, FP, TN and FN?
My cost function could be (1-TPR)^2 + FPR^2, where TPR = TP/(FN + TP) and FPR = FP/(FP + TN).
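
For reference, here is a minimal sketch of that cost computed from hard 0/1 predictions, written in plain Python so it runs without any framework. The function name confusion_cost and the example lists are made up for illustration, and the sketch assumes both classes occur in t so that neither denominator is zero.

```python
def confusion_cost(y, t):
    """(1 - TPR)^2 + FPR^2 from hard 0/1 predictions y and 0/1 targets t."""
    pairs = list(zip(y, t))
    tp = sum(1 for yi, ti in pairs if yi == 1 and ti == 1)
    fp = sum(1 for yi, ti in pairs if yi == 1 and ti == 0)
    tn = sum(1 for yi, ti in pairs if yi == 0 and ti == 0)
    fn = sum(1 for yi, ti in pairs if yi == 0 and ti == 1)
    tpr = tp / (fn + tp)  # true positive rate
    fpr = fp / (fp + tn)  # false positive rate
    return (1 - tpr) ** 2 + fpr ** 2


# Hypothetical predictions and targets, only to exercise the function.
y = [1, 0, 1, 1, 0, 0]
t = [1, 0, 0, 1, 1, 0]
cost = confusion_cost(y, t)
print(cost)
```

Because TP, FP, TN and FN are integer counts of thresholded outputs, this value only changes in jumps, when a prediction flips from 0 to 1 or back.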

rasbt · February 25, 2019, 3:01am

Making a cost function out of a confusion matrix doesn’t make sense to me: how do you differentiate that? It may be useful as a weighting term though. You can look into classification cost matrices, i.e., you can assign higher weights to certain classes based on the confusion matrix cells. By default, the classification cost matrix has 0’s on the diagonal and the same cost in every off-diagonal cell.
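
To make the cost-matrix idea a bit more concrete, here is a rough sketch in plain Python of an expected misclassification cost for a single sample. The name expected_cost, the convention cost_matrix[true_class][predicted_class], and all the numbers are assumptions chosen for illustration, not the API of any particular library.

```python
def expected_cost(probs, target, cost_matrix):
    """Expected cost of one sample under a classification cost matrix.

    probs:       predicted class probabilities, e.g. [p(class 0), p(class 1)]
    target:      index of the true class
    cost_matrix: cost_matrix[true][pred], with zeros on the diagonal
    """
    row = cost_matrix[target]
    return sum(p * c for p, c in zip(probs, row))


# Default-style matrix: every mistake costs the same.
uniform = [[0.0, 1.0],
           [1.0, 0.0]]

# Hypothetical weighting: missing a true class 1 costs five times more.
weighted = [[0.0, 1.0],
            [5.0, 0.0]]

probs = [0.8, 0.2]  # model leans towards class 0
print(expected_cost(probs, 1, uniform))
print(expected_cost(probs, 1, weighted))
```

Since the result is a weighted sum of the predicted probabilities, it varies smoothly with the model output, unlike the raw confusion matrix counts.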

Kushaj · February 26, 2019, 1:55pm

I think you are misunderstanding what a confusion matrix is used for.

Suppose you have a classification problem and you get 97% accuracy. How do you know whether 97% is good or not? You look at the confusion matrix to see the individual counts. Why is this important? Think of a skewed problem where the label is True 99% of the time. In that case an accuracy of 99% is not good, because you get the same accuracy by predicting True every time.
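
A small sketch of that skewed case, using made-up data: 99 positive samples, 1 negative sample, and a hypothetical model that always predicts True. The variable names are for illustration only.

```python
# Made-up skewed dataset: 99 positives and a single negative.
t = [1] * 99 + [0]
# A "model" that ignores its input and always predicts True (1).
y = [1] * len(t)

tp = sum(1 for yi, ti in zip(y, t) if yi == 1 and ti == 1)
fp = sum(1 for yi, ti in zip(y, t) if yi == 1 and ti == 0)
tn = sum(1 for yi, ti in zip(y, t) if yi == 0 and ti == 0)
fn = sum(1 for yi, ti in zip(y, t) if yi == 0 and ti == 1)

accuracy = (tp + tn) / len(t)
tpr = tp / (tp + fn)  # true positive rate
fpr = fp / (fp + tn)  # false positive rate

print("accuracy:", accuracy)
print("TPR:", tpr, "FPR:", fpr)
```

The accuracy looks excellent, but the false positive rate shows that the single negative sample is always misclassified, which is exactly what the confusion matrix exposes.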

So you use the confusion matrix as a tool to quantify how much to trust the accuracy you are getting from your model. Depending on your classification problem, you then use the numbers from the confusion matrix to fine-tune further or to train completely new models.

But you cannot use the confusion matrix as your cost function, as explained by @rasbt.

sonnguyen · February 26, 2019, 10:18pm

The confusion matrix is used to evaluate my model, while I trained it using a cost function such as MSE or cross-entropy. That is not good from a mathematical point of view, since the model is minimizing a value which is not the value used to evaluate it.
So I was wondering if there is a cost function which directly minimizes a value in the confusion matrix. It seems the answer is no.

sonnguyen · February 26, 2019, 10:21pm

Thanks very much for your answer.
It’s just that my model is evaluated with a confusion matrix, while in training I have to choose something else to minimize.
I was wondering if there is a way to directly minimize the confusion matrix; now I know that the answer is no.

tom · February 26, 2019, 11:20pm

Note that cross-entropy and accuracy are quite strongly related - if you take the output as probabilities for the various answers and build a ‘soft’ confusion matrix from them, cross-entropies penalise precisely the off-diagonal entries.
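
One way to picture the ‘soft’ confusion matrix is the sketch below, in plain Python for a binary problem. It assumes the model outputs a probability of the positive class per sample. The names soft_confusion and soft_cost and the example numbers are made up for illustration. The cost formula is the (1-TPR)^2 + FPR^2 from the original question, with soft counts swapped in for the hard ones.

```python
def soft_confusion(probs, targets):
    """Soft TP, FP, TN, FN from positive-class probabilities and 0/1 targets."""
    tp = sum(p * t for p, t in zip(probs, targets))
    fp = sum(p * (1 - t) for p, t in zip(probs, targets))
    tn = sum((1 - p) * (1 - t) for p, t in zip(probs, targets))
    fn = sum((1 - p) * t for p, t in zip(probs, targets))
    return tp, fp, tn, fn


def soft_cost(probs, targets):
    """(1 - TPR)^2 + FPR^2 computed from the soft counts."""
    tp, fp, tn, fn = soft_confusion(probs, targets)
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return (1 - tpr) ** 2 + fpr ** 2


probs = [0.9, 0.2, 0.6, 0.3]  # hypothetical model outputs
targets = [1, 0, 1, 0]

# Thresholding first gives the usual hard counts (0/1 values make soft = hard).
hard_preds = [1.0 if p >= 0.5 else 0.0 for p in probs]
hard = soft_cost(hard_preds, targets)

# Finite-difference slope with respect to the first probability.
eps = 1e-6
bumped = [probs[0] + eps] + probs[1:]
slope = (soft_cost(bumped, targets) - soft_cost(probs, targets)) / eps
hard_bumped = soft_cost([1.0 if p >= 0.5 else 0.0 for p in bumped], targets)

print("soft cost:", soft_cost(probs, targets), "hard cost:", hard)
print("slope of soft cost:", slope, "change in hard cost:", hard_bumped - hard)
```

The soft cost reacts to a small change in a probability, while the thresholded cost does not move at all, which is why only the soft version offers a usable gradient. The same sums written with tensor operations should be differentiable by autograd, but this is only an illustration, not a tested training loss.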

Best regards

Thomas