Dataset Information

Machine Learning Strategies When Transitioning between Biological Assays.

ABSTRACT: Machine learning is widely used in drug development to predict activity in biological assays based on chemical structure. However, the process of transitioning from one experimental setup to another for the same biological endpoint has not been extensively studied. In this retrospective study, we explore different strategies for combining data from the old and new assays when training conformal prediction models, using data from hERG and Na_V assays. We suggest continuously monitoring the validity and efficiency of models as more data accumulate from the new assay, and selecting a modeling strategy based on these metrics. To maximize the utility of data from the old assay, we propose a strategy that augments the proper training set of an inductive conformal predictor with data from the old assay while keeping only data from the new assay in the calibration set. This results in valid (well-calibrated) models with improved efficiency compared to the other strategies. We study the results for varying sizes of new and old assay data sets, allowing for discussion of different practical scenarios. We also conclude that our proposed assay transition strategy is more beneficial, and the value of data from the new assay is higher, for the harder case of regression than for classification problems.
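
The strategy described in the abstract can be illustrated with a minimal sketch of an inductive conformal regressor. This is not the authors' implementation: the least-squares model, the absolute-residual nonconformity score, the synthetic data, and all names (make_assay, fit_linear, predict_linear, conformal_quantile, icp_intervals) are assumptions chosen for illustration. The only element taken from the abstract is the data split: old-assay data are added to the proper training set, while the calibration set contains new-assay data only.

```python
import numpy as np

def make_assay(rng, n, offset):
    # Hypothetical synthetic assay: same linear signal, assay-specific offset.
    X = rng.normal(size=(n, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + offset + rng.normal(scale=0.3, size=n)
    return X, y

def fit_linear(X, y):
    # Ordinary least squares with an intercept column (stand-in for any model).
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

def conformal_quantile(cal_scores, eps):
    # k-th smallest calibration score with k = ceil((n + 1) * (1 - eps)).
    n = len(cal_scores)
    k = int(np.ceil((n + 1) * (1 - eps)))
    if k > n:
        return np.inf  # too few calibration points for this significance level
    return np.sort(cal_scores)[k - 1]

def icp_intervals(X_old, y_old, X_new_train, y_new_train,
                  X_cal, y_cal, X_test, eps):
    # Proper training set: old-assay data plus part of the new-assay data.
    X_proper = np.vstack([X_old, X_new_train])
    y_proper = np.concatenate([y_old, y_new_train])
    coef = fit_linear(X_proper, y_proper)
    # Calibration uses new-assay data only, so the nonconformity scores are
    # exchangeable with those of future new-assay compounds.
    cal_scores = np.abs(y_cal - predict_linear(coef, X_cal))
    q = conformal_quantile(cal_scores, eps)
    pred = predict_linear(coef, X_test)
    return pred - q, pred + q

rng = np.random.default_rng(0)
X_old, y_old = make_assay(rng, 500, offset=0.5)   # old assay, shifted
X_new, y_new = make_assay(rng, 1300, offset=0.0)  # new assay
lo, hi = icp_intervals(X_old, y_old,
                       X_new[:100], y_new[:100],        # new data for training
                       X_new[100:300], y_new[100:300],  # new data for calibration
                       X_new[300:], eps=0.2)
coverage = np.mean((y_new[300:] >= lo) & (y_new[300:] <= hi))
print(f"empirical coverage at eps=0.2: {coverage:.3f}")
print(f"mean interval width: {np.mean(hi - lo):.3f}")
```

In this sketch, validity corresponds to the empirical coverage staying close to 1 - eps on new-assay test data, and efficiency corresponds to the interval width. Both can be recomputed as more new-assay data arrive, in the spirit of the monitoring the abstract recommends.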

SUBMITTER: Arvidsson McShane S

PROVIDER: S-EPMC8317157 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

Project description: Evaluation of reactive astrogliosis by neuroanatomical assays represents a common experimental outcome for neuroanatomists. The literature demonstrates several conflicting results as to the accuracy of such measures. We posited that the diverging results within the neuroanatomy literature were due to suboptimal analytical workflows in addition to astrocyte regional heterogeneity. We therefore generated an automated segmentation workflow to extract features of glial fibrillary acidic protein (GFAP) and aldehyde dehydrogenase family 1, member L1 (ALDH1L1) labeled astrocytes with and without neuroinflammation. We achieved this by capturing multiplexed immunofluorescent confocal images of brains from mice treated with either vehicle or lipopolysaccharide (LPS), followed by implementation of our workflows. Using classical image analysis techniques focused on pixel intensity only, we were unable to identify differences between vehicle-treated and LPS-treated animals. However, when utilizing machine learning-based algorithms, we were able to (1) accurately predict which objects were derived from GFAP-stained or ALDH1L1-stained images, indicating that GFAP and ALDH1L1 highlight distinct morphological aspects of astrocytes; (2) predict which neuroanatomical region a segmented GFAP or ALDH1L1 object had been derived from, indicating that morphological features of astrocytes change as a function of neuroanatomical location; and (3) predict, with statistical significance albeit not high accuracy, which objects had come from LPS-treated versus vehicle-treated animals, indicating that although features capable of distinguishing LPS-treated from vehicle-treated GFAP-segmented and ALDH1L1-segmented objects exist, significant overlap between morphologies remains. We further determined that for most classification scenarios, nonlinear models were required for improved treatment class designations. We propose that unbiased automated image analysis techniques coupled with well-validated machine learning tools represent highly useful models capable of providing insights into neuroanatomical assays.
