ABSTRACT: Introduction
Lack of participation in clinical trials (CTs) is a major barrier to the evaluation of new pharmaceuticals and devices. Here we report an analysis of a dataset from ResearchMatch, an online clinical registry, using supervised machine learning approaches and a deep learning approach to identify characteristics of individuals more likely to show an interest in participating in CTs.
Methods
We trained six supervised machine learning classifiers (Logistic Regression (LR), Decision Tree (DT), Gaussian Naïve Bayes (GNB), K-Nearest Neighbor Classifier (KNC), AdaBoost Classifier (ABC) and Random Forest Classifier (RFC)), as well as a deep learning method, a Convolutional Neural Network (CNN), using a dataset of 841,377 instances and 20 features, including demographic data, geographic constraints, medical conditions and ResearchMatch visit history. The outcome variable was each participant's response ('yes' or 'no') indicating interest when presented with an invitation to a specific clinical trial opportunity. Furthermore, we created four subsets of this dataset based on the top self-reported medical conditions and gender, and analysed each separately.
Results
The deep learning model outperformed the machine learning classifiers, achieving an area under the curve (AUC) of 0.8105.
Conclusions
The results show sufficient evidence of meaningful correlations between the predictor variables and the outcome variable in the datasets analysed using the supervised machine learning classifiers. These approaches show promise in identifying individuals who may be more likely to participate when offered an opportunity for a clinical trial.
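The abstract compares models by area under the ROC curve (AUC) for a binary 'yes'/'no' outcome. As a minimal sketch of what that metric measures, the function below computes AUC with the rank-based (Mann-Whitney) formulation: the probability that a randomly chosen 'yes' case receives a higher score than a randomly chosen 'no' case, with ties counted as one half. The record does not say how the authors computed AUC; the function name `roc_auc` and the toy labels and scores are assumptions made only for illustration.

```python
def roc_auc(labels, scores):
    """Rank-based AUC for binary labels (1 = 'yes', 0 = 'no').

    Hypothetical helper for illustration; not the authors' code.
    """
    pairs = sorted(zip(scores, labels))  # ascending by score
    n = len(pairs)

    # Assign 1-based ranks, giving tied scores their average rank.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    # Mann-Whitney U statistic for the positive class, scaled to [0, 1].
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


# Toy example (made-up values): three 'yes' and two 'no' responses.
toy_labels = [1, 0, 1, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6]
print(round(roc_auc(toy_labels, toy_scores), 4))  # 0.6667
```

In the toy example, four of the six possible 'yes'/'no' pairs are ordered correctly, which gives 4/6. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.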
SUBMITTER: Vazquez J
PROVIDER: S-EPMC8057403 | biostudies-literature
REPOSITORIES: biostudies-literature