Dataset Information

A hybrid framework for improving uncertainty quantification in deep learning-based QSAR regression modeling.

ABSTRACT: Reliable uncertainty quantification for statistical models is crucial in various downstream applications, especially for drug design and discovery where mistakes may incur a large amount of cost. This topic has therefore absorbed much attention and a plethora of methods have been proposed over the past years. The approaches that have been reported so far can be mainly categorized into two classes: distance-based approaches and Bayesian approaches. Although these methods have been widely used in many scenarios and shown promising performance with their distinct superiorities, being overconfident on out-of-distribution examples still poses challenges for the deployment of these techniques in real-world applications. In this study we investigated a number of consensus strategies in order to combine both distance-based and Bayesian approaches together with post-hoc calibration for improved uncertainty quantification in QSAR (Quantitative Structure-Activity Relationship) regression modeling. We employed a set of criteria to quantitatively assess the ranking and calibration ability of these models. Experiments based on 24 bioactivity datasets were designed to make critical comparison between the model we proposed and other well-studied baseline models. Our findings indicate that the hybrid framework proposed by us can robustly enhance the model ability of ranking absolute errors. Together with post-hoc calibration on the validation set, we show that well-calibrated uncertainty quantification results can be obtained in domain shift settings. The complementarity between different methods is also conceptually analyzed.
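As a rough illustration of what post-hoc calibration on a validation set can look like, the sketch below rescales predicted standard deviations by a single factor fitted on validation data. This is one common recalibration scheme (minimizing the Gaussian negative log-likelihood), not necessarily the procedure used in the study; the function names `fit_scale` and `apply_scale` and all numbers are hypothetical.

```python
import math

def fit_scale(val_errors, val_sigmas):
    """Return the scalar s that minimises the Gaussian negative log-likelihood
    of the validation errors under N(0, (s * sigma)^2).

    Closed form: s^2 = mean((error / sigma)^2).
    """
    ratios = [(err / sigma) ** 2 for err, sigma in zip(val_errors, val_sigmas)]
    return math.sqrt(sum(ratios) / len(ratios))

def apply_scale(sigmas, scale):
    """Rescale predicted standard deviations with the fitted factor."""
    return [scale * sigma for sigma in sigmas]

# Hypothetical validation data: prediction errors and predicted std devs.
val_errors = [0.5, -1.0, 2.0, -0.2]
val_sigmas = [0.25, 0.5, 1.0, 0.1]

scale = fit_scale(val_errors, val_sigmas)

# Hypothetical test-set uncertainties, rescaled with the validation-fitted factor.
test_sigmas = [0.3, 0.8]
calibrated = apply_scale(test_sigmas, scale)
print(scale, calibrated)
```

A fitted factor above 1 suggests the raw uncertainties were overconfident on the validation set; a factor below 1 suggests they were underconfident.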

SUBMITTER: Wang D

PROVIDER: S-EPMC8454160 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

Statistical Framework for Uncertainty Quantification in Computational Molecular Modeling.

Project description: As computational modeling, simulation, and predictions are becoming integral parts of biomedical pipelines, it behooves us to emphasize the reliability of the computational protocol. For any reported quantity of interest (QOI), one must also compute and report a measure of the uncertainty or error associated with the QOI. This is especially important in molecular modeling, since in most practical applications the inputs to the computational protocol are often noisy, incomplete, or low-resolution. Unfortunately, currently available modeling tools do not account for uncertainties and their effect on the final QOIs with sufficient rigor. We have developed a statistical framework that expresses the uncertainty of the QOI as the probability that the reported value deviates from the true value by more than some user-defined threshold. First, we provide a theoretical approach where this probability can be bounded using Azuma-Hoeffding-like inequalities. Second, we approximate this probability empirically by sampling the space of uncertainties of the input and provide applications of our framework to bound uncertainties of several QOIs commonly used in molecular modeling. Finally, we also present several visualization techniques to effectively and quantitatively visualize the uncertainties: in the input, final QOIs, and also intermediate states.

| S-EPMC6857703 | biostudies-literature

Statistical Framework for Uncertainty Quantification in Computational Molecular Modeling.

| S-EPMC5710766 | biostudies-literature

Reliable deep-learning-based phase imaging with uncertainty quantification.

Project description: Emerging deep-learning (DL)-based techniques have significant potential to revolutionize biomedical imaging. However, one outstanding challenge is the lack of reliability assessment in the DL predictions, whose errors are commonly revealed only in hindsight. Here, we propose a new Bayesian convolutional neural network (BNN)-based framework that overcomes this issue by quantifying the uncertainty of DL predictions. Foremost, we show that BNN-predicted uncertainty maps provide surrogate estimates of the true error from the network model and measurement itself. The uncertainty maps characterize imperfections often unknown in real-world applications, such as noise, model error, incomplete training data, and out-of-distribution testing data. Quantifying this uncertainty provides a per-pixel estimate of the confidence level of the DL prediction as well as the quality of the model and data set. We demonstrate this framework in the application of large space-bandwidth product phase imaging using a physics-guided coded illumination scheme. From only five multiplexed illumination measurements, our BNN predicts gigapixel phase images in both static and dynamic biological samples with quantitative credibility assessment. Furthermore, we show that low-certainty regions can identify spatially and temporally rare biological phenomena. We believe our uncertainty learning framework is widely applicable to many DL-based biomedical imaging techniques for assessing the reliability of DL predictions.

| S-EPMC8329751 | biostudies-literature

Hybrid Classification/Regression Approach to QSAR Modeling of Stoichiometric Antiradical Capacity Assays' Endpoints.

Project description: Quantitative structure-activity relationships (QSAR) are a widely used methodology allowing not only a better understanding of the mechanisms of chemical reactions, including radical scavenging, but also to predict the relevant properties of chemical compounds without their synthesis, isolation and experimental testing. Unlike the QSAR modeling of the kinetic antioxidant assays, modeling of the assays with stoichiometric endpoints depends strongly on the number of hydroxyl groups in the antioxidant molecule, as well as on some integral molecular descriptors characterizing the proportion of OH-groups able to enter and complete the radical scavenging reaction. In this work, we tested the feasibility of a "hybrid" classification/regression approach, consisting of explicit classification of individual OH-groups as involved in radical scavenging reactions, and using further the number of these OH-groups as a descriptor in simple-regression QSAR models of antiradical capacity assays with stoichiometric endpoints. A simple threshold classification based on the sum of trolox-equivalent antiradical capacity values was used, selecting OH-groups with specific radical stability- and reactivity-related electronic parameters or their combination as "active" or "inactive". We showed that this classification/regression modeling approach provides a substantial improvement of the simple-regression QSAR models over those built on the number of total phenolic OH-groups only, and yields a statistical performance similar to that of the best reported multiple-regression QSARs for antiradical capacity assays with stoichiometric endpoints.

| S-EPMC9000788 | biostudies-literature

The Quest for Model Uncertainty Quantification: A Hybrid Ensemble and Variational Data Assimilation Framework.

Project description: This article presents a novel approach to couple a deterministic four-dimensional variational (4DVAR) assimilation method with the particle filter (PF) ensemble data assimilation system, to produce a robust approach for dual-state-parameter estimation. In our proposed method, the Hybrid Ensemble and Variational Data Assimilation framework for Environmental systems (HEAVEN), we characterize the model structural uncertainty in addition to model parameter and input uncertainties. The sequential PF is formulated within the 4DVAR system to design a computationally efficient feedback mechanism throughout the assimilation period. In this framework, the 4DVAR optimization produces the maximum a posteriori estimate of state variables at the beginning of the assimilation window without the need to develop the adjoint of the forecast model. The 4DVAR solution is then perturbed by a newly defined prior error covariance matrix to generate an initial condition ensemble for the PF system to provide more accurate and reliable posterior distributions within the same assimilation window. The prior error covariance matrix is updated from one cycle to another over the main assimilation period to account for model structural uncertainty, resulting in an improved estimation of the posterior distribution. The premise of the presented approach is that it (1) accounts for all sources of uncertainties involved in hydrologic predictions, (2) uses a small ensemble size, and (3) precludes particle degeneracy and sample impoverishment. The proposed method is applied to a nonlinear hydrologic model, and the effectiveness, robustness, and reliability of the method are demonstrated for several river basins across the United States.

| S-EPMC6559328 | biostudies-literature

Federated learning framework integrating REFINED CNN and Deep Regression Forests

Project description: Summary: Predictive learning from medical data incurs additional challenges due to concerns over privacy and security of personal data. Federated learning, intentionally structured to preserve a high level of privacy, is emerging as an attractive way to generate cross-silo predictions in medical scenarios. However, the impact of severe population-level heterogeneity on federated learners is not well explored. In this article, we propose a methodology to detect the presence of population heterogeneity in federated settings and propose a solution to handle such heterogeneity by developing a federated version of Deep Regression Forests. Additionally, we demonstrate that the recently conceptualized REpresentation of Features as Images with NEighborhood Dependencies CNN framework can be combined with the proposed Federated Deep Regression Forests to provide improved performance as compared to existing approaches. Availability and implementation: The Python source code for reproducing the main results is available on GitHub: https://github.com/DanielNolte/FederatedDeepRegressionForests. Contact: ranadip.pal@ttu.edu. Supplementary information: Supplementary data are available at Bioinformatics Advances online.

| S-EPMC10074025 | biostudies-literature

Automatic calculation of myocardial perfusion reserve using deep learning with uncertainty quantification.

Project description: Background: Myocardial perfusion reserve index (MPRI) in magnetic resonance imaging (MRI) is an important indicator of ischemia, and its measurement typically involves manual procedures. The purposes of this study were to develop a fully automatic method for estimating the MPRI and to evaluate its performance. Methods: The method consisted of segmenting the myocardium in dynamic contrast-enhanced (DCE) myocardial perfusion MRI data using Monte Carlo dropout U-Net, dividing the myocardium into segments based on landmark localization with machine learning, and estimating the MPRI after the calculation of the left ventricular and myocardial contrast upslopes. The proposed method was compared with a reference method, which involved manual adjustments of the myocardial contours and upslope ranges. Results: In test subjects, MPRIs measured by the proposed technique correlated with those by the manual reference in segmental assessment [intraclass correlation coefficient (ICC) = 0.75, 95% CI: 0.70-0.79, P<0.001]. The automatic and reference MPRI values showed a mean difference of -0.02 and 95% limits of agreement of (-0.86, 0.82). Conclusions: The proposed automatic method is based on deep learning segmentation and machine learning landmark detection for MPRI measurements in DCE perfusion MRI. It holds the potential to efficiently and quantitatively assess myocardial ischemia without any user interaction.

| S-EPMC10722070 | biostudies-literature

Deep spectral learning for label-free optical imaging oximetry with uncertainty quantification.

Project description: Measurement of blood oxygen saturation (sO2) by optical imaging oximetry provides invaluable insight into local tissue functions and metabolism. Despite different embodiments and modalities, all label-free optical-imaging oximetry techniques utilize the same principle of sO2-dependent spectral contrast from haemoglobin. Traditional approaches for quantifying sO2 often rely on analytical models that are fitted by the spectral measurements. These approaches in practice suffer from uncertainties due to biological variability, tissue geometry, light scattering, systemic spectral bias, and variations in the experimental conditions. Here, we propose a new data-driven approach, termed deep spectral learning (DSL), to achieve oximetry that is highly robust to experimental variations and, more importantly, able to provide uncertainty quantification for each sO2 prediction. To demonstrate the robustness and generalizability of DSL, we analyse data from two visible light optical coherence tomography (vis-OCT) setups across two separate in vivo experiments on rat retinas. Predictions made by DSL are highly adaptive to experimental variabilities as well as the depth-dependent backscattering spectra. Two neural-network-based models are tested and compared with the traditional least-squares fitting (LSF) method. The DSL-predicted sO2 shows significantly lower mean-square errors than those of the LSF. For the first time, we have demonstrated en face maps of retinal oximetry along with a pixel-wise confidence assessment. Our DSL overcomes several limitations of traditional approaches and provides a more flexible, robust, and reliable deep learning approach for in vivo non-invasive label-free optical oximetry.

| S-EPMC6864044 | biostudies-literature

An uncertainty-based interpretable deep learning framework for predicting breast cancer outcome.

Project description: Background: Predicting outcome of breast cancer is important for selecting appropriate treatments and prolonging the survival periods of patients. Recently, different deep learning-based methods have been carefully designed for cancer outcome prediction. However, the application of these methods is still challenged by interpretability. In this study, we proposed a novel multitask deep neural network called UISNet to predict the outcome of breast cancer. The UISNet is able to interpret the importance of features for the prediction model via an uncertainty-based integrated gradients algorithm. UISNet improved the prediction by introducing prior biological pathway knowledge and utilizing patient heterogeneity information. Results: The model was tested in seven public datasets of breast cancer, and showed better performance (average C-index = 0.691) than the state-of-the-art methods (average C-index = 0.650, ranged from 0.619 to 0.677). Importantly, the UISNet identified 20 genes as associated with breast cancer, among which 11 have been proven to be associated with breast cancer by previous studies, and others are novel findings of this study. Conclusions: Our proposed method is accurate and robust in predicting breast cancer outcomes, and it is an effective way to identify breast cancer-associated genes. The method codes are available at: https://github.com/chh171/UISNet.

| S-EPMC10902951 | biostudies-literature

A statistical framework for quantification and visualisation of positional uncertainty in deep brain stimulation electrodes.

Project description: Deep brain stimulation (DBS) is an established therapy for treating patients with movement disorders such as Parkinson's disease. Patient-specific computational modelling and visualisation have been shown to play a key role in surgical and therapeutic decisions for DBS. The computational models use brain imaging, such as magnetic resonance (MR) and computed tomography (CT), to determine the DBS electrode positions within the patient's head. The finite resolution of brain imaging, however, introduces uncertainty in electrode positions. The DBS stimulation settings for optimal patient response are sensitive to the relative positioning of DBS electrodes to a specific neural substrate (white/grey matter). In our contribution, we study positional uncertainty in the DBS electrodes for imaging with finite resolution. In a three-step approach, we first derive a closed-form mathematical model characterising the geometry of the DBS electrodes. Second, we devise a statistical framework for quantifying the uncertainty in the positional attributes of the DBS electrodes, namely the direction of longitudinal axis and the contact-centre positions at subvoxel levels. The statistical framework leverages the analytical model derived in step one and a Bayesian probabilistic model for uncertainty quantification. Finally, the uncertainty in contact-centre positions is interactively visualised through volume rendering and isosurfacing techniques. We demonstrate the efficacy of our contribution through experiments on synthetic and real datasets. We show that the spatial variations in true electrode positions are significant for finite resolution imaging, and interactive visualisation can be instrumental in exploring probabilistic positional variations in the DBS lead.

| S-EPMC6559743 | biostudies-literature
