Dataset Information

The Fermi-Dirac distribution provides a calibrated probabilistic output for binary classifiers.


ABSTRACT: Binary classification is one of the central problems in machine-learning research and, as such, investigations of its general statistical properties are of interest. We studied the ranking statistics of items in binary classification problems and observed that there is a formal and surprising relationship between the probability of a sample belonging to one of the two classes and the Fermi-Dirac distribution determining the probability that a fermion occupies a given single-particle quantum state in a physical system of noninteracting fermions. Using this equivalence, it is possible to compute a calibrated probabilistic output for binary classifiers. We show that the area under the receiver operating characteristic curve (AUC) in a classification problem is related to the temperature of an equivalent physical system. In a similar manner, the optimal decision threshold between the two classes is associated with the chemical potential of an equivalent physical system. Using our framework, we also derive a closed-form expression to calculate the variance for the AUC of a classifier. Finally, we introduce FiDEL (Fermi-Dirac-based ensemble learning), an ensemble learning algorithm that uses the calibrated nature of the classifier's output probability to combine possibly very different classifiers.
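
As a rough illustration of the correspondence described in the abstract, the sketch below evaluates a Fermi-Dirac-shaped class probability as a function of an item's rank. It is not taken from the dataset or from the FiDEL code. The functional form P(positive | rank r) = 1 / (1 + exp(beta * (r - mu))), with rank 1 assigned to the highest classifier score, is an assumption made here, and the function names and the parameters `beta` (inverse temperature) and `mu` (chemical potential, acting as a threshold rank) are hypothetical. The AUC estimate applies the standard Mann-Whitney rank-sum identity to expected counts, which is only an approximation.

```python
import math


def fermi_dirac(rank, beta, mu):
    """Assumed form: P(positive | rank) = 1 / (1 + exp(beta * (rank - mu)))."""
    x = beta * (rank - mu)
    # Evaluate the logistic in a numerically stable way for large |x|.
    if x >= 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def expected_counts(n_items, beta, mu):
    """Expected number of positives and expected rank sum of positives."""
    probs = [fermi_dirac(r, beta, mu) for r in range(1, n_items + 1)]
    n_pos = sum(probs)
    rank_sum = sum(r * p for r, p in enumerate(probs, start=1))
    return n_pos, rank_sum


def auc_from_rank_sum(n_items, n_pos, rank_sum):
    """Mann-Whitney identity, with rank 1 being the top-scored item."""
    n_neg = n_items - n_pos
    # u counts (negative, positive) pairs in which the negative is ranked first.
    u = rank_sum - n_pos * (n_pos + 1) / 2.0
    return 1.0 - u / (n_pos * n_neg)


def approx_auc(n_items, beta, mu):
    """Approximate AUC implied by (beta, mu), using expected counts only."""
    n_pos, rank_sum = expected_counts(n_items, beta, mu)
    return auc_from_rank_sum(n_items, n_pos, rank_sum)


if __name__ == "__main__":
    # Larger beta (a "colder" system) gives a sharper step and a higher AUC.
    for beta in (0.0, 0.05, 0.5):
        print(beta, approx_auc(100, beta, mu=50.5))
```

Under these assumptions, `beta = 0` makes every rank equally likely to be positive and the approximate AUC is 0.5, while increasing `beta` sharpens the transition around `mu` and pushes the approximate AUC toward 1. This is consistent with the abstract's statement that the AUC is related to the temperature and the decision threshold to the chemical potential.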

SUBMITTER: Kim SC 

PROVIDER: S-EPMC8403970 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC10043529 | biostudies-literature
| S-EPMC7051961 | biostudies-literature
| S-EPMC3138069 | biostudies-literature
| S-EPMC7923594 | biostudies-literature
| S-EPMC6994491 | biostudies-literature
| S-EPMC5556122 | biostudies-literature
| S-EPMC6099897 | biostudies-other
| S-EPMC6936362 | biostudies-literature
| S-EPMC5161428 | biostudies-literature
| S-EPMC10724392 | biostudies-literature