Dataset Information

A Bayesian method for comparing and combining binary classifiers in the absence of a gold standard.

ABSTRACT:

Background

Many problems in bioinformatics involve classification based on features such as sequence, structure or morphology. Given multiple classifiers, two crucial questions arise: how does their performance compare, and how can they best be combined to produce a better classifier? A classifier can be evaluated in terms of sensitivity and specificity using benchmark, or gold standard, data, that is, data for which the true classification is known. However, a gold standard is not always available. Here we demonstrate that a Bayesian model for comparing medical diagnostics without a gold standard can be successfully applied in the bioinformatics domain, to genomic scale data sets. We present a new implementation, which unlike previous implementations is applicable to any number of classifiers. We apply this model, for the first time, to the problem of finding the globally optimal logical combination of classifiers.
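As a minimal illustration of the gold-standard evaluation described above, the following sketch computes sensitivity and specificity for a single binary classifier from predicted and true labels. The function name and the toy labels are hypothetical; they are not taken from the paper's code or data.

```python
def sensitivity_specificity(predicted, truth):
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)  # true positives
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)  # false negatives
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)  # true negatives
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one true positive case and one true negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical gold-standard labels and classifier calls.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(predicted, truth)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Without a gold standard the `truth` vector is unavailable, which is the situation the Bayesian model in this record is designed to handle.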

Results

We compared three classifiers of protein subcellular localisation, and evaluated our estimates of sensitivity and specificity against estimates obtained using a gold standard. The method overestimated sensitivity and specificity with only a small discrepancy, and correctly ranked the classifiers. Diagnostic tests for swine flu were then compared on a small data set. Lastly, classifiers for a genome-wide association study of macular degeneration with 541094 SNPs were analysed. In all cases, run times were feasible, and results precise. The optimal logical combination of classifiers was also determined for all three data sets. Code and data are available from http://bioinformatics.monash.edu.au/downloads/.

Conclusions

The examples demonstrate the methods are suitable for both small and large data sets, applicable to the wide range of bioinformatics classification problems, and robust to dependence between classifiers. In all three test cases, the globally optimal logical combination of the classifiers was found to be their union, according to three out of four ranking criteria. We propose as a general rule of thumb that the union of classifiers will be close to optimal.
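To make the idea of a logical combination concrete, the sketch below computes the sensitivity and specificity of the union (logical OR) and the intersection (logical AND) of several classifiers. It assumes the classifiers are conditionally independent given the true class, which is a simplification and not necessarily the paper's model; the function names and input values are hypothetical.

```python
from math import prod


def union_performance(sens, spec):
    """Performance of the OR-combination, assuming conditional independence.

    A case is called positive if any classifier calls it positive.
    """
    combined_sens = 1.0 - prod(1.0 - s for s in sens)  # misses only if all miss
    combined_spec = prod(spec)                          # negative only if all say negative
    return combined_sens, combined_spec


def intersection_performance(sens, spec):
    """Performance of the AND-combination, assuming conditional independence.

    A case is called positive only if every classifier calls it positive.
    """
    combined_sens = prod(sens)
    combined_spec = 1.0 - prod(1.0 - s for s in spec)
    return combined_sens, combined_spec


# Hypothetical per-classifier sensitivities and specificities.
sens = [0.6, 0.7, 0.5]
spec = [0.95, 0.90, 0.98]
union_sens, union_spec = union_performance(sens, spec)
inter_sens, inter_spec = intersection_performance(sens, spec)
print(f"union: sens={union_sens:.4f} spec={union_spec:.4f}")
print(f"intersection: sens={inter_sens:.4f} spec={inter_spec:.4f}")
```

Under this assumption the union trades some specificity for a large gain in sensitivity, which is one way to read the rule of thumb stated in the conclusions.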

SUBMITTER: Keith JM

PROVIDER: S-EPMC3473310 | biostudies-literature | 2012 Jul

REPOSITORIES: biostudies-literature


Publications

A Bayesian method for comparing and combining binary classifiers in the absence of a gold standard.

Jonathan M Keith, Christian M Davey, Sarah E Boyd

BMC Bioinformatics, 2012 Jul 27


PMID: 22838505

Similar Datasets

Project description:

Objective: Bayesian models play an important role in evaluating diagnostic tests in the absence of a gold standard: an external prior distribution for each parameter is combined with sample data to yield the posterior distribution of the test characteristics. However, correlation between diagnostic tests is a problem that cannot be ignored in Bayesian model evaluation. This study discusses how different Bayesian models, correlation scenarios, and prior distributions affect the outcome.

Methods: The data analyzed in this study were gathered during studies of patients presenting to the Nanjing Chest Hospital with suspected tuberculosis. The diagnostic characteristics of the T-SPOT.Tb and KD38 tuberculosis antibody tests were evaluated under different Bayesian models, and discharge diagnosis was used as a gold standard to verify the model results.

Results: Comparison of the four models under conditional independence found that the Bayesian probabilistic constraint model was consistent with the Conditional Covariance Bayesian model. The results were mainly affected by prior information. With prior constraints, the sensitivity and specificity of the two tests were considerably higher in the Conditional Covariance Bayesian model than in the Bayesian probabilistic constraint model. The results of the four models under conditional dependence were similar to those under conditional independence; pD was also negative without prior constraints in both the Bayesian probabilistic constraint model and the Conditional Covariance Bayesian model. The Deviance Information Criterion of the Bayesian probabilistic constraint model was close to that of the Conditional Covariance Bayesian model, but pD of the Conditional Covariance Bayesian model with prior constraints (pD=2.40) was higher than that of the Bayesian probabilistic constraint model with prior constraints (pD=1.66).

Conclusion: In our data, the result of the Conditional Covariance Bayesian model with prior constraints under conditional independence was closest to the result of the gold standard evaluation. Both Bayesian methods are feasible ways to evaluate diagnostic tests in the absence of a gold standard. Prior source, priority number, and conditional dependencies should be considered in method selection, since the accuracy of posterior estimation depends mainly on the prior distribution.
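For reference, the Deviance Information Criterion (DIC) and the effective number of parameters pD mentioned above are conventionally defined as follows. These are the standard textbook definitions, not formulas quoted from this study; the symbols y (data) and θ (model parameters) are notation introduced here.

```latex
D(\theta) = -2 \log p(y \mid \theta), \qquad
p_D = \bar{D} - D(\bar{\theta}), \qquad
\mathrm{DIC} = \bar{D} + p_D
```

Here \bar{D} is the posterior mean of the deviance and \bar{\theta} is the posterior mean of the parameters. Because pD is a difference of two quantities, it can come out negative, for example when the posterior mean is a poor summary of the posterior distribution.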

Project description:

Background: Spatial modeling is increasingly used to elucidate relationships between demographic, environmental, and socioeconomic factors and infectious disease prevalence data. However, few studies focus on spatio-temporal modeling that takes into account the uncertainty of diagnostic techniques.

Methodology/principal findings: We obtained Schistosoma japonicum prevalence data, based on a standardized indirect hemagglutination assay (IHA), from annual reports from 114 schistosome-endemic villages in Dangtu County, in the southeastern part of the People's Republic of China, for the period 1995 to 2004. Environmental data were extracted from satellite images. Socioeconomic data were available from village registries. We used Bayesian spatio-temporal models, accounting for the sensitivity and specificity of the IHA test via an equation derived from the law of total probability, to relate the observed prevalence to the 'true' prevalence. The risk of S. japonicum was positively associated with the mean land surface temperature, and negatively correlated with the mean normalized difference vegetation index and distance to the nearest water body. There was no significant association between S. japonicum and the socioeconomic status of the villages surveyed. The spatial correlation structures of the observed S. japonicum seroprevalence and the estimated infection prevalence differed from one year to another. Variance estimates from a model adjusted for diagnostic error were larger than those from unadjusted models. The generated prediction map for 2005 showed that most of the former and current infections occur in close proximity to the Yangtze River.

Conclusion/significance: Bayesian spatio-temporal modeling incorporating diagnostic uncertainty is a suitable approach for risk mapping of S. japonicum prevalence data. The Yangtze River and its tributaries govern schistosomiasis transmission in Dangtu County, but spatial correlation needs to be taken into consideration when making risk predictions at small scales.
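The relation between observed and 'true' prevalence referred to above follows from the law of total probability: a positive test is either a true positive or a false positive. The sketch below shows only that point relation and its inverse; the study itself embeds it in a Bayesian spatio-temporal model, which is not reproduced here. Function names and numeric values are hypothetical.

```python
def apparent_prevalence(true_prev, sens, spec):
    """P(test positive) = Se * p + (1 - Sp) * (1 - p)."""
    return sens * true_prev + (1.0 - spec) * (1.0 - true_prev)


def true_prevalence(apparent_prev, sens, spec):
    """Invert the relation above; the estimate is clipped to [0, 1]."""
    denom = sens + spec - 1.0
    if denom <= 0:
        raise ValueError("test must be informative: sens + spec must exceed 1")
    estimate = (apparent_prev + spec - 1.0) / denom
    return min(1.0, max(0.0, estimate))


# Hypothetical values, not the IHA characteristics used in the study.
observed = apparent_prevalence(true_prev=0.10, sens=0.90, spec=0.80)
recovered = true_prevalence(observed, sens=0.90, spec=0.80)
print(f"observed={observed:.3f} recovered={recovered:.3f}")
```

With an imperfect test, the observed prevalence in this example is well above the true prevalence, which is why the adjustment matters for risk mapping.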

Project description:

Background: Occupational stress is associated with adverse outcomes for medical professionals and patients. In our cross-sectional study of 136 general practices, 26.4% of 550 practice assistants showed high chronic stress. Because machine learning strategies offer the opportunity to improve understanding of chronic stress by exploiting complex interactions between variables, we used data from our previous study to derive the best analytic model for chronic stress: four common machine learning (ML) approaches are compared to a classical statistical procedure.

Methods: We applied four machine learning classifiers (random forest, support vector machine, K-nearest neighbours, and artificial neural network), with logistic regression as the standard approach, to analyze factors contributing to chronic stress in practice assistants. Chronic stress had been measured by the standardized, self-administered TICS-SSCS questionnaire. The performance of these models was compared in terms of predictive accuracy based on the area under the receiver operating characteristic curve (AUC), sensitivity, and positive predictive value.

Findings: Compared to the standard logistic regression model (AUC 0.636, 95% CI 0.490-0.674), all machine learning models improved prediction: random forest +20.8% (AUC 0.844, 95% CI 0.684-0.843), artificial neural network +12.4% (AUC 0.760, 95% CI 0.605-0.777), support vector machine +15.1% (AUC 0.787, 95% CI 0.634-0.802), and K-nearest neighbours +7.1% (AUC 0.707, 95% CI 0.556-0.735). As the best prediction model, random forest showed a sensitivity of 99% and a positive predictive value of 79%. Based on the variable frequencies at the decision nodes of the random forest model, the following five work characteristics influence chronic stress: too much work, high demand to concentrate, time pressure, complicated tasks, and insufficient support by practice leaders.

Conclusions: For chronic stress prediction, machine learning classifiers, especially random forest, provided more accurate prediction than classical logistic regression. Interventions to reduce chronic stress in practice personnel should primarily address the identified workplace characteristics.
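The improvements quoted in the findings can be checked directly from the reported AUC values. The short sketch below suggests that the percentages are absolute differences in AUC (percentage points) relative to logistic regression rather than relative changes; that reading is an inference from the arithmetic, not something the abstract states. Variable names are hypothetical.

```python
# AUC values as reported in the abstract above.
baseline_auc = 0.636  # logistic regression
model_auc = {
    "random forest": 0.844,
    "support vector machine": 0.787,
    "artificial neural network": 0.760,
    "K-nearest neighbours": 0.707,
}

# Absolute AUC gain over the baseline, expressed in percentage points.
gains = {
    name: round((auc - baseline_auc) * 100, 1)
    for name, auc in model_auc.items()
}

for name, gain in gains.items():
    print(f"{name}: +{gain} percentage points")
```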

