Dataset Information

Clustering clinical trials with similar eligibility criteria features.

ABSTRACT: To automatically identify and cluster clinical trials with similar eligibility features. Using the public repository ClinicalTrials.gov as the data source, we extracted semantic features from the eligibility criteria text of all clinical trials and constructed a trial-feature matrix. We calculated the pairwise similarities for all clinical trials based on their eligibility features. For all trials, by selecting one trial as the center each time, we identified trials whose similarities to the central trial were greater than or equal to a predefined threshold and constructed center-based clusters. Then we identified unique trial sets with distinctive trial membership compositions from center-based clusters by disregarding their structural information. From the 145,745 clinical trials on ClinicalTrials.gov, we extracted 5,508,491 semantic features. Of these, 459,936 were unique and 160,951 were shared by at least one pair of trials. Crowdsourcing the cluster evaluation using Amazon Mechanical Turk (MTurk), we identified the optimal similarity threshold, 0.9. Using this threshold, we generated 8,806 center-based clusters. Evaluation of a sample of the clusters by MTurk resulted in a mean score of 4.331±0.796 on a scale of 1-5 (5 indicating "strongly agree that the trials in the cluster are similar"). We contribute an automated approach to clustering clinical trials with similar eligibility features. This approach can be potentially useful for investigating knowledge reuse patterns in clinical trial eligibility criteria designs and for improving clinical trial recruitment. We also contribute an effective crowdsourcing method for evaluating informatics interventions.
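
The center-based clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the similarity measure, so Jaccard similarity over feature sets is assumed here, and the function names (`jaccard`, `center_based_clusters`, `unique_trial_sets`), the toy trial IDs, and the choice to drop centers with no similar trials are all hypothetical. Only the "greater than or equal to a threshold" rule and the deduplication of clusters by membership come from the text.

```python
def jaccard(a, b):
    """Jaccard similarity of two feature sets (assumed measure, not from the source)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def center_based_clusters(trial_features, threshold=0.9):
    """Take each trial in turn as the center and collect every other trial
    whose similarity to the center is >= threshold."""
    clusters = {}
    for center, center_feats in trial_features.items():
        members = {
            other
            for other, other_feats in trial_features.items()
            if other != center and jaccard(center_feats, other_feats) >= threshold
        }
        if members:  # assumption: a center with no similar trials forms no cluster
            clusters[center] = members | {center}
    return clusters


def unique_trial_sets(clusters):
    """Ignore which trial was the center and keep distinct member sets only."""
    return {frozenset(members) for members in clusters.values()}


# Toy data: trial IDs and features are made up for illustration.
toy_trials = {
    "A": {"age>=18", "type 2 diabetes", "hba1c>7"},
    "B": {"age>=18", "type 2 diabetes", "hba1c>7"},
    "C": {"age>=18", "type 2 diabetes", "pregnancy"},
    "D": {"healthy volunteer"},
}

strict = center_based_clusters(toy_trials, threshold=0.9)
loose = center_based_clusters(toy_trials, threshold=0.5)
print(unique_trial_sets(strict))
print(unique_trial_sets(loose))
```

With the stricter threshold only the two identical trials cluster together; lowering it pulls in the partially overlapping trial, and in both cases several center-based clusters collapse into a single unique trial set. The all-pairs comparison is quadratic in the number of trials, so a corpus of the size reported in the abstract would need a sparse or indexed approach.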

SUBMITTER: Hao T

PROVIDER: S-EPMC4119097 | biostudies-literature | 2014 Dec

REPOSITORIES: biostudies-literature

Publications

Clustering clinical trials with similar eligibility criteria features.

Hao T, Rusanov A, Boland MR, Weng C

Journal of Biomedical Informatics, 2014-02-01

PMID: 24496068

Similar Datasets

Project description: Importance: Eligibility criteria for randomized clinical trials (RCTs) are designed to select clinically relevant patient populations. However, not all eligibility criteria are strongly justified, potentially excluding marginalized groups and limiting the generalizability of trial findings. Objective: To summarize and evaluate the justification of exclusion criteria in published RCTs in critical care medicine. Evidence review: A systematic sampling review of parallel-group RCTs published in the top 5 general internal medicine journals by impact factor (The Lancet, New England Journal of Medicine, Journal of the American Medical Association, British Medical Journal, and Annals of Internal Medicine) between January 1, 2018, and February 23, 2023, was conducted. RCTs enrolling adults in intensive care units (ICUs) and RCTs enrolling critically ill patients who required life-sustaining interventions typically initiated in the ICU were included. All study exclusion criteria were categorized as either poorly justified, potentially justified, or strongly justified, adapting previously established criteria, independently and in duplicate. Findings: In total, 225 studies were identified, 75 of which were included. The median (IQR) number of exclusion criteria per trial was 19 (14-24), with 1455 total exclusion criteria. Common exclusion criteria were related to the risk of adverse reaction to interventions (302 criteria [20.8%]), followed by inability to obtain consent (120 criteria [8.2%]), and treatment limitation decisions (97 criteria [6.7%]). Most exclusion criteria were either strongly justified (1080 criteria [74.2%]) or potentially justified (297 criteria [20.4%]), whereas 5.4% (78 criteria) were poorly justified. Of the 78 poorly justified exclusion criteria, the most common were pregnancy (19 criteria [24.4%]), communication barriers (11 criteria [14.1%]), lactation (10 criteria [12.8%]), and lack of health insurance (10 criteria [12.8%]). Overall, 45 of 75 studies (60.0%) had at least 1 poorly justified exclusion criterion. Conclusions and relevance: Most exclusion criteria in critical care medicine RCTs were strongly justifiable. Across poorly justified criteria, the most common exclusions were pregnant or lactating persons, those with communication barriers, and individuals without health insurance. This highlights the need to carefully consider exclusion criteria when designing trials to minimize the inappropriate exclusion of participants and enhance generalizability.

Project description: Background: Breast cancer (BC) is the most common cancer type in women. The purpose of this study was to assess the eligibility criteria in recent clinical trials in BC, especially those that can limit the enrollment of older patients as well as those with comorbidities and poor performance status. Methods: Data on clinical trials in BC were extracted from ClinicalTrials.gov. Co-primary outcomes were proportions of trials with different types of the eligibility criteria. Associations between trial characteristics and the presence of certain types of these criteria (binary variable) were determined with univariate and multivariate logistic regression. Results: Our analysis included 522 trials of systemic anticancer treatments started between 2020 and 2022. Upper age limits, strict exclusion criteria pertaining to comorbidities, and those referring to inadequate performance status of the patient were used in 204 (39%), 404 (77%), and 360 (69%) trials, respectively. Overall, 493 trials (94%) had at least one of these criteria. The odds of the presence of each type of the exclusion criteria were significantly associated with investigational site location and trial phase. We also showed that the odds of the upper age limits and the exclusion criteria involving the performance status were significantly higher in the cohort of recent trials compared with a cohort of 309 trials started between 2010 and 2012 (39% vs 19% and 69% vs 46%, respectively; p < 0.001 for univariate and multivariate analysis in both comparisons). The proportion of trials with strict exclusion criteria was comparable between the two cohorts (p > 0.05). Only three of the recent trials (1%) enrolled solely patients aged 65 or 70 and older. Conclusions: Many recent clinical trials in BC exclude large groups of patients, especially older adults, individuals with different comorbidities, and those with poor performance status. Careful modification of some of the eligibility criteria in these trials should be considered to allow investigators to assess the benefits and harms of investigational treatments in participants with characteristics typically encountered in clinical practice.
