Dataset Information

An argument for reporting data standardization procedures in multi-site predictive modeling: case study on the impact of LOINC standardization on model performance.

ABSTRACT:

Objectives

We aimed to gain a better understanding of how standardization of laboratory data can impact predictive model performance in multi-site datasets. We hypothesized that standardizing local laboratory codes to Logical Observation Identifiers Names and Codes (LOINC) would produce predictive models that significantly outperform those learned using local laboratory codes.

Materials and methods

We predicted 30-day hospital readmission for a set of heart failure-specific visits to 13 hospitals from 2008 to 2012. Laboratory test results were extracted and then manually cleaned and mapped to LOINC. We extracted features to summarize laboratory data for each patient and used a training dataset (2008-2011) to learn models using a variety of feature selection techniques and classifiers. We evaluated our hypothesis by comparing model performance on an independent test dataset (2012).
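
The effect of the mapping step on the feature space can be illustrated with a minimal sketch. It is not the authors' pipeline: the site names, local codes, values, and the `summarize` helper are hypothetical, and the per-visit mean stands in for whatever summary features the study actually used. The two LOINC codes are the ones commonly used for serum/plasma creatinine and sodium.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical mapping from (site, local code) to LOINC.
# Site names and local codes are invented for illustration.
LOCAL_TO_LOINC = {
    ("site_a", "CREAT"): "2160-0",    # creatinine, serum/plasma
    ("site_b", "LAB1234"): "2160-0",  # same test, different local code
    ("site_a", "NA"): "2951-2",       # sodium, serum/plasma
    ("site_b", "LAB5678"): "2951-2",
}


def summarize(results, use_loinc):
    """Return {visit: {feature name: mean result value}}."""
    values = defaultdict(lambda: defaultdict(list))
    for r in results:
        if use_loinc:
            key = LOCAL_TO_LOINC.get((r["site"], r["code"]))
            if key is None:
                continue  # unmapped codes are dropped in this sketch
        else:
            # Local codes are only meaningful within their own site.
            key = f'{r["site"]}:{r["code"]}'
        values[r["visit"]][key].append(r["value"])
    return {
        visit: {name: mean(xs) for name, xs in feats.items()}
        for visit, feats in values.items()
    }


# Made-up results: the same test recorded at two sites.
results = [
    {"visit": 1, "site": "site_a", "code": "CREAT", "value": 1.0},
    {"visit": 1, "site": "site_a", "code": "CREAT", "value": 2.0},
    {"visit": 2, "site": "site_b", "code": "LAB1234", "value": 3.0},
]

local_features = summarize(results, use_loinc=False)
loinc_features = summarize(results, use_loinc=True)

# Distinct feature names across all visits under each coding scheme.
local_names = {name for feats in local_features.values() for name in feats}
loinc_names = {name for feats in loinc_features.values() for name in feats}
```

With local codes the same test yields one feature per site, so each feature is populated only for that site's visits. After mapping, both visits share a single feature.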

Results

Models that utilized LOINC performed significantly better than models that utilized local laboratory test codes, regardless of the feature selection technique and classifier approach used.

Discussion and conclusion

We quantitatively demonstrated the positive impact of standardizing multi-site laboratory data to LOINC prior to use in predictive models. We used our findings to argue for the need for detailed reporting of data standardization procedures in predictive modeling, especially in studies leveraging multi-site datasets extracted from electronic health records.

SUBMITTER: Barda AJ

PROVIDER: S-EPMC6435008 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

Project description: Background: Screening for eligible patients continues to pose a great challenge for many clinical trials. This has led to a rapidly growing interest in standardizing computable representations of eligibility criteria (EC) in order to develop tools that leverage data from electronic health record (EHR) systems. Although laboratory procedures (LP) represent a common entity of EC that is readily available and retrievable from EHR systems, there is a lack of interoperable data models for this entity of EC. A public, specialized data model that utilizes international, widely-adopted terminology for LP, e.g. Logical Observation Identifiers Names and Codes (LOINC®), is much needed to support automated screening tools. Objective: The aim of this study is to establish a core dataset for LP most frequently requested to recruit patients for clinical trials using LOINC terminology. Employing such a core dataset could enhance the interface between study feasibility platforms and EHR systems and significantly improve automatic patient recruitment. Methods: We used a semi-automated approach to analyze 10,516 screening forms from the Medical Data Models (MDM) portal's data repository that are pre-annotated with Unified Medical Language System (UMLS). An automated semantic analysis based on concept frequency is followed by an extensive manual expert review performed by physicians to analyze complex recruitment-relevant concepts not amenable to automatic approach. Results: Based on analysis of 138,225 EC from 10,516 screening forms, 55 laboratory procedures represented 77.87% of all UMLS laboratory concept occurrences identified in the selected EC forms. We identified 26,413 unique UMLS concepts from 118 UMLS semantic types and covered the vast majority of Medical Subject Headings (MeSH) disease domains. Conclusions: Only a small set of common LP covers the majority of laboratory concepts in screening EC forms which supports the feasibility of establishing a focused core dataset for LP. We present ELaPro, a novel, LOINC-mapped, core dataset for the most frequent 55 LP requested in screening for clinical trials. ELaPro is available in multiple machine-readable data formats like CSV, ODM and HL7 FHIR. The extensive manual curation of this large number of free-text EC as well as the combining of UMLS and LOINC terminologies distinguishes this specialized dataset from previous relevant datasets in the literature.
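
The coverage figure in the results (a small set of procedures accounting for most concept occurrences) is a top-k share of a frequency table. A minimal sketch of that calculation follows; the concept names and counts are made up and the `top_k_coverage` helper is hypothetical, so it illustrates the arithmetic only, not the study's analysis.

```python
from collections import Counter


def top_k_coverage(counts, k):
    """Share of all occurrences accounted for by the k most frequent concepts."""
    total = sum(counts.values())
    top = sum(n for _, n in counts.most_common(k))
    return top / total


# Made-up occurrence counts of laboratory concepts in screening forms.
counts = Counter({
    "hemoglobin": 50,
    "creatinine": 30,
    "platelets": 15,
    "rare_test": 5,
})

coverage = top_k_coverage(counts, k=2)  # (50 + 30) / 100
```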

Project description: BACKGROUND: Evidence-based health care is informed by results of randomized clinical trials (RCTs) and their syntheses in meta-analyses. When the trial outcomes measured are not clearly described in trial publications, knowledge synthesis, translation, and decision-making may be impeded. While heterogeneity in outcomes measured in adolescent major depressive disorder (MDD) RCTs has been described, the comprehensiveness of outcome reporting is unknown. This study aimed to assess the reporting of primary outcomes in RCTs evaluating treatments for adolescent MDD. METHODS: RCTs evaluating treatment interventions in adolescents with a diagnosis of MDD published between 2008 and 2017 specifying a single primary outcome were eligible for outcome reporting assessment. Outcome reporting assessment was done independently in duplicate using a comprehensive checklist of 58 reporting items. Primary outcome information provided in each RCT publication was scored as "fully reported", "partially reported", or "not reported" for each checklist item, as applicable. RESULTS: Eighteen of 42 identified articles were found to have a discernable single primary outcome and were included for outcome reporting assessment. Most trials (72%) did not fully report on over half of the 58 checklist items. Items describing masking of outcome assessors, timing and frequency of outcome assessment, and outcome analyses were fully reported in over 70% of trials. Items less frequently reported included outcome measurement instrument properties (ranging from 6 to 17%), justification of timing and frequency of outcome assessment (6%), and justification of criteria used for clinically significant differences (17%). The overall comprehensiveness of reporting appeared stable over time. CONCLUSIONS: Heterogeneous reporting exists in published adolescent MDD RCTs, with frequent omissions of key details about their primary outcomes. These omissions may impair interpretability, replicability, and synthesis of RCTs that inform clinical guidelines and decision-making in this field. Consensus on the minimal criteria for outcome reporting in adolescent MDD RCTs is needed.

Project description: BACKGROUND: The T1 Mapping and Extracellular volume (ECV) Standardization (T1MES) program explored T1 mapping quality assurance using a purpose-developed phantom with Food and Drug Administration (FDA) and Conformité Européenne (CE) regulatory clearance. We report T1 measurement repeatability across centers describing sequence, magnet, and vendor performance. METHODS: Phantoms batch-manufactured in August 2015 underwent 2 years of structural imaging, B0 and B1, and "reference" slow T1 testing. Temperature dependency was evaluated by the United States National Institute of Standards and Technology and by the German Physikalisch-Technische Bundesanstalt. Center-specific T1 mapping repeatability (maximum one scan per week to minimum one per quarter year) was assessed over mean 358 (maximum 1161) days on 34 1.5 T and 22 3 T magnets using multiple T1 mapping sequences. Image and temperature data were analyzed semi-automatically. Repeatability of serial T1 was evaluated in terms of coefficient of variation (CoV), and linear mixed models were constructed to study the interplay of some of the known sources of T1 variation. RESULTS: Over 2 years, phantom gel integrity remained intact (no rips/tears), B0 and B1 homogenous, and "reference" T1 stable compared to baseline (% change at 1.5 T, 1.95 ± 1.39%; 3 T, 2.22 ± 1.44%). Per degrees Celsius, 1.5 T, T1 (MOLLI 5s(3s)3s) increased by 11.4 ms in long native blood tubes and decreased by 1.2 ms in short post-contrast myocardium tubes. Agreement of estimated T1 times with "reference" T1 was similar across Siemens and Philips CMR systems at both field strengths (adjusted R² ranges for both field strengths, 0.99-1.00). Over 1 year, many 1.5 T and 3 T sequences/magnets were repeatable with mean CoVs < 1 and 2% respectively. Repeatability was narrower for 1.5 T over 3 T. Within T1MES repeatability for native T1 was narrow for several sequences, for example, at 1.5 T, Siemens MOLLI 5s(3s)3s prototype number 448B (mean CoV = 0.27%) and Philips modified Look-Locker inversion recovery (MOLLI) 3s(3s)5s (CoV 0.54%), and at 3 T, Philips MOLLI 3b(3s)5b (CoV 0.33%) and Siemens shortened MOLLI (ShMOLLI) prototype 780C (CoV 0.69%). After adjusting for temperature and field strength, it was found that the T1 mapping sequence and scanner software version (both P < 0.001 at 1.5 T and 3 T), and to a lesser extent the scanner model (P = 0.011, 1.5 T only), had the greatest influence on T1 across multiple centers. CONCLUSION: The T1MES CE/FDA approved phantom is a robust quality assurance device. In a multi-center setting, T1 mapping had performance differences between field strengths, sequences, scanner software versions, and manufacturers. However, several specific combinations of field strength, sequence, and scanner are highly repeatable, and thus, have potential to provide standardized assessment of T1 times for clinical use, although temperature correction is required for native T1 tubes at least.
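
For readers unfamiliar with the repeatability metric, the coefficient of variation is the standard deviation of repeated measurements divided by their mean, usually expressed as a percentage. The sketch below assumes the sample standard deviation, since the abstract does not say which estimator was used. The serial T1 values are made up and the `coefficient_of_variation` helper is hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CoV in percent: sample standard deviation divided by the mean."""
    return 100.0 * stdev(values) / mean(values)


# Made-up serial T1 measurements (ms) of one phantom tube on one scanner.
serial_t1_ms = [1000.0, 1010.0, 990.0]

cov_percent = coefficient_of_variation(serial_t1_ms)
```

For these values the mean is 1000 ms and the sample standard deviation is 10 ms, so the CoV is about 1%.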