NaN values in confusion matrix

JanaR · October 29, 2021, 7:05am

Hi,
I got a warning that 5 NaN values in the confusion matrix were replaced with 0.
What does that mean? What does it mean to have NaN values in the confusion matrix?

ptrblck · October 29, 2021, 7:09am

Could you describe your use case a bit more and which method raises the issue?
Based on your current description, I guess you are using a third-party package (e.g. sklearn.metrics.confusion_matrix) which tries to normalize the confusion matrix and might be running into a division by zero?
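
To illustrate how such a division by zero could arise, here is a minimal sketch in plain Python. It assumes the normalization divides each row (one row per true class) by its row sum, which is what a "normalize over true labels" option usually means. The helper normalize_rows and the example counts are made up for illustration and are not part of any library. A class that never occurs as a true label has an all-zero row, so every entry in that row becomes 0/0.

```python
import math


def normalize_rows(counts):
    """Divide each row by its sum (hypothetical helper, not a library function).

    A row whose sum is zero is filled with nan, which is what 0/0 gives in
    floating-point tensor arithmetic. Plain Python would raise
    ZeroDivisionError instead, so the case is handled explicitly here.
    """
    normalized = []
    for row in counts:
        total = sum(row)
        if total == 0:
            normalized.append([math.nan] * len(row))
        else:
            normalized.append([c / total for c in row])
    return normalized


# Rows are true classes, columns are predicted classes.
# Class 2 never occurs as a true label, so its row is all zeros.
counts = [
    [3, 1, 0],
    [0, 4, 0],
    [0, 0, 0],
]

normalized = normalize_rows(counts)
nan_count = sum(math.isnan(v) for row in normalized for v in row)
print(normalized)
print(nan_count, "nan values")
```

A library that replaces these entries with zeros and warns about it would report one NaN per cell of each empty row.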

JanaR · October 29, 2021, 7:57am

Hi @ptrblck,
This is the line of code I am using:

 confusion_matrix = tm.ConfusionMatrix(num_classes=nb_classes, threshold=0.5, normalize='true', multilabel=True)

and this is the warning:
Testing: 0it [00:00, ?it/s]C:\ProgramData\Miniconda3\lib\site-packages\torchmetrics\utilities\prints.py:37: UserWarning: 16 nan values found in confusion matrix have been replaced with zeros.

ptoews · December 3, 2021, 3:36pm

I had the same problem. I was calling the ConfusionMatrix metric's forward (metric()) at every training step, and at that point it had not yet encountered every class, so it couldn’t properly normalize the counts (it was dividing zero by zero). I solved it by only updating the metric (metric.update()) at every training step. Since I log only every 50 training steps, by which time every class has been encountered, I no longer get any warnings.