Several papers have demonstrated that minimizing cross-entropy or MSE does not necessarily maximize the area under the ROC curve (AUC). Are there any differentiable loss functions in PyTorch that can be used as a proxy for AUC?
Two papers have excellent proposals:
1. Approximation to the Wilcoxon-Mann-Whitney Statistic (ICML 2003). Paper link here
2. Scalable Learning of Non-Decomposable Objectives: a new arXiv 2020 paper by Elad @ Google that managed to bump AUC-PR from 84% to 95%. Paper link here
The first proposal (1) is available as a loss function in Keras. Is something similar available in the PyTorch ecosystem?
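
For concreteness, here is a rough sketch of the kind of loss I have in mind for (1). It is my own reading of the pairwise Wilcoxon-Mann-Whitney surrogate, not code from the paper or from any library: every positive/negative score pair whose margin falls below `gamma` is penalized by the shortfall raised to the power `p`. The function name `wmw_auc_loss`, the default values of `gamma` and `p`, and the choice to average over pairs rather than sum are all assumptions on my part.

```python
import torch


def wmw_auc_loss(scores, labels, gamma=0.3, p=2):
    """Pairwise AUC surrogate (hypothetical helper, not a PyTorch built-in).

    scores: 1-D float tensor of model outputs.
    labels: 1-D tensor of 0/1 targets, same length as scores.
    """
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    if pos.numel() == 0 or neg.numel() == 0:
        # No positive/negative pairs in this batch: return a zero that
        # still participates in the autograd graph.
        return scores.sum() * 0.0
    # diff[i, j] = pos[i] - neg[j], shape (n_pos, n_neg) via broadcasting.
    diff = pos.unsqueeze(1) - neg.unsqueeze(0)
    # Penalize only pairs whose margin is smaller than gamma.
    penalty = torch.clamp(gamma - diff, min=0.0) ** p
    # Assumption: average over pairs so the scale does not depend on batch size.
    return penalty.mean()
```

Note that this builds an `n_pos × n_neg` matrix per batch, so memory grows quadratically with batch size, and batches containing only one class contribute no gradient.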