Confusion Matrix for multi-class classification problem

I am working on a 4-class classification problem and have some results from a ResNet. I want to evaluate them further with a confusion matrix, and I have seen that sklearn has a method to calculate it. Since I am training a CNN, which gives results for every epoch, how can I plot the confusion matrix for the best epoch?

Usually you would calculate the validation loss and accuracy after each epoch. You could also store the predictions for the entire validation dataset in each epoch and calculate the confusion matrix from them afterwards.
If you only want to compute the confusion matrix for the “best” model, store the state_dict of the model whenever the validation loss decreases (or the validation accuracy increases), reload that state_dict after the training is finished, and calculate the confusion matrix then.
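
A minimal sketch of the second approach, assuming random 4-class data and a tiny linear model as hypothetical stand-ins for your dataset and ResNet (the tensor shapes, learning rate, and number of epochs are made up for illustration). Note the `copy.deepcopy`: `state_dict()` returns references to the live parameters, so without the copy the "best" weights would keep changing during training.

```python
import copy

import torch
import torch.nn as nn
from sklearn.metrics import confusion_matrix

torch.manual_seed(0)

# Hypothetical stand-ins: random data and a linear layer instead of the
# real dataset and ResNet.
num_classes = 4
x_train, y_train = torch.randn(256, 16), torch.randint(0, num_classes, (256,))
x_val, y_val = torch.randn(64, 16), torch.randint(0, num_classes, (64,))
model = nn.Linear(16, num_classes)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

best_val_loss = float("inf")
best_state = None
best_epoch = -1

for epoch in range(5):
    # One (full-batch) training step per epoch, to keep the sketch short.
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(x_train), y_train)
    loss.backward()
    optimizer.step()

    # Validation loss for this epoch.
    model.eval()
    with torch.no_grad():
        val_loss = criterion(model(x_val), y_val).item()

    # Keep a copy of the weights whenever the validation loss improves.
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        best_epoch = epoch
        best_state = copy.deepcopy(model.state_dict())

# After training: reload the best weights and predict on the validation set.
model.load_state_dict(best_state)
model.eval()
with torch.no_grad():
    preds = model(x_val).argmax(dim=1)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_val.numpy(), preds.numpy(), labels=list(range(num_classes)))
print(f"best epoch: {best_epoch}, val loss: {best_val_loss:.4f}")
print(cm)
```

To plot the result, `sklearn.metrics.ConfusionMatrixDisplay(cm).plot()` should work if matplotlib is installed. For long trainings you would probably write the state_dict to disk with `torch.save` instead of keeping it in memory.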

@ptrblck so just as accuracy and validation loss are calculated for every epoch, are other evaluation measures like the confusion matrix, F1-score, recall, and precision also calculated for every epoch? Or do we calculate these measures only for the best model?

I don’t think there is any rule, and it depends on your use case. Calculating each of these metrics will of course slow down your code a bit, so you would have to decide whether you want and need to track these additional stats for each epoch.
How noticeable the “slowdown” is also depends on the general use case. E.g. if an epoch already takes ~1h to finish, you wouldn’t notice the additional calculations even if they took ~1min (and I would assume they are faster than that).
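
If you do decide to track these metrics for each epoch, a rough sketch could look like the following. It assumes you have already collected the validation labels and the predictions of the current epoch as arrays; here both are random placeholders, and `history` and `metric_seconds` are hypothetical names. The timer is only there so you can check on your own setup how much the extra calculations cost.

```python
import time

import numpy as np
from sklearn.metrics import confusion_matrix, precision_recall_fscore_support

rng = np.random.default_rng(0)
num_classes = 4
num_epochs = 3
labels = list(range(num_classes))

# Hypothetical stand-in for the validation labels.
y_true = rng.integers(0, num_classes, size=1000)

history = []
for epoch in range(num_epochs):
    # Placeholder for the predictions gathered during this epoch's validation loop.
    y_pred = rng.integers(0, num_classes, size=1000)

    start = time.perf_counter()
    # Macro average: unweighted mean of the per-class scores.
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, labels=labels, average="macro", zero_division=0
    )
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    elapsed = time.perf_counter() - start

    history.append(
        {
            "epoch": epoch,
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "confusion_matrix": cm,
            "metric_seconds": elapsed,
        }
    )
    print(f"epoch {epoch}: P={precision:.3f} R={recall:.3f} F1={f1:.3f} ({elapsed:.4f}s)")
```

Comparing `metric_seconds` with the duration of one epoch should tell you whether the overhead matters in your case.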