Dataset Information

A longitudinal analysis of data quality in a large pediatric data research network.


ABSTRACT: Objective: PEDSnet is a clinical data research network (CDRN) that aggregates electronic health record data from multiple children's hospitals to enable large-scale research. Assessing data quality to ensure suitability for conducting research is a key requirement in PEDSnet. This study presents a range of data quality issues identified over a period of 18 months and interprets them to evaluate the research capacity of PEDSnet. Materials and Methods: Results were generated by a semiautomated data quality assessment workflow. Two investigators reviewed programmatic data quality issues and conducted discussions with the data partners' extract-transform-load analysts to determine the cause for each issue. Results: The results include a longitudinal summary of 2182 data quality issues identified across 9 data submission cycles. The metadata from the most recent cycle includes annotations for 850 issues: most frequent types, including missing data (>300) and outliers (>100); most complex domains, including medications (>160) and lab measurements (>140); and primary causes, including source data characteristics (83%) and extract-transform-load errors (9%). Discussion: The longitudinal findings demonstrate the network's evolution from identifying difficulties with aligning the data to a common data model to learning norms in clinical pediatrics and determining research capability. Conclusion: While data quality is recognized as a critical aspect in establishing and utilizing a CDRN, the findings from data quality assessments are largely unpublished. This paper presents a real-world account of studying and interpreting data quality findings in a pediatric CDRN, and the lessons learned could be used by other CDRNs.

SUBMITTER: Khare R 

PROVIDER: S-EPMC6259665 | biostudies-literature | 2017 Nov

REPOSITORIES: biostudies-literature

Publications

A longitudinal analysis of data quality in a large pediatric data research network.

Khare R, Utidjian L, Ruth BJ, Kahn MG, Burrows E, Marsolo K, Patibandla N, Razzaghi H, Colvin R, Ranade D, Kitzmiller M, Eckrich D, Bailey LC

Journal of the American Medical Informatics Association : JAMIA, 2017-11-01, issue 6


Similar Datasets

S-EPMC6676917 | biostudies-literature
S-EPMC3165456 | biostudies-other
S-EPMC4750242 | biostudies-literature
S-EPMC10308424 | biostudies-literature
S-EPMC3243635 | biostudies-literature
S-EPMC3320898 | biostudies-literature
S-EPMC3931556 | biostudies-literature
S-EPMC9556519 | biostudies-literature
S-EPMC8129477 | biostudies-literature
S-EPMC8039553 | biostudies-literature