Project description:Objective: To conduct a systematic scoping review of explainable artificial intelligence (XAI) models that use real-world electronic health record data, categorize these techniques according to different biomedical applications, identify gaps of current studies, and suggest future research directions. Materials and methods: We searched MEDLINE, IEEE Xplore, and the Association for Computing Machinery (ACM) Digital Library to identify relevant papers published between January 1, 2009 and May 1, 2019. We summarized these studies based on the year of publication, prediction tasks, machine learning algorithm, dataset(s) used to build the models, the scope, category, and evaluation of the XAI methods. We further assessed the reproducibility of the studies in terms of the availability of data and code and discussed open issues and challenges. Results: Forty-two articles were included in this review. We reported the research trend and most-studied diseases. We grouped XAI methods into 5 categories: knowledge distillation and rule extraction (N = 13), intrinsically interpretable models (N = 9), data dimensionality reduction (N = 8), attention mechanism (N = 7), and feature interaction and importance (N = 5). Discussion: XAI evaluation is an open issue that requires a deeper focus in the case of medical applications. We also discuss the importance of reproducibility of research work in this field, as well as the challenges and opportunities of XAI from 2 medical professionals' point of view. Conclusion: Based on our review, we found that XAI evaluation in medicine has not been adequately and formally practiced. Reproducibility remains a critical concern. Ample opportunities exist to advance XAI research in medicine.
Project description:In prior work, Friends of Cancer Research convened multiple data partners to establish standardized definitions for oncology real-world end points derived from electronic health records (EHRs) and claims data. Here, we assessed the performance of real-world overall survival (rwOS) from data sets sourced from EHRs by evaluating the ability of the end point to reflect expected differences from a previous randomized controlled trial across five data sources, after applying inclusion/exclusion criteria. The KEYNOTE-189 clinical trial protocol of platinum doublet chemotherapy (chemotherapy) vs. programmed cell death protein 1 (PD-1) in combination with platinum doublet chemotherapy (PD-1 combination) in first-line nonsquamous metastatic non-small cell lung cancer guided retrospective cohort selection. The Kaplan-Meier product limit estimator was used to calculate 12-month rwOS with 95% confidence intervals (CIs) in each data source. Cox proportional hazards models estimated hazard ratios (HRs) and associated 95% CIs, controlled for prognostic factors. Once the inclusion/exclusion criteria were applied, the five resulting data sets included 155 to 1,501 patients in the chemotherapy cohort and 36 to 405 patients in the PD-1 combination cohort. Twelve-month rwOS ranged from 45% to 58% in the chemotherapy cohort and 44% to 68% in the PD-1 combination cohort. The adjusted HR for death ranged from 0.80 (95% CI: 0.69, 0.93) to 1.15 (95% CI: 0.71, 1.85), controlling for age, gender, performance status, and smoking status. This study yielded insights regarding data capture, including ability of real-world data to precisely identify patient populations and the impact of criteria on end points. Sensitivity analyses could elucidate data set-specific factors that drive results.
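As a minimal sketch of the Kaplan-Meier product limit estimator named in the preceding abstract, the following Python computes a survival curve and reads off a 12-month estimate. The follow-up times and event flags are invented toy values, not data from the study, and the function names `kaplan_meier` and `survival_at` are illustrative. Confidence intervals and the Cox model are deliberately left out.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival)] at each distinct event time.

    durations: follow-up time per patient (here, months).
    events: 1 if death was observed at that time, 0 if censored.
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # deaths plus censorings at each time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product limit step: multiply by the conditional
            # probability of surviving past time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Everyone who died or was censored at t leaves the risk set.
        at_risk -= exits[t]
    return curve


def survival_at(curve, t):
    """Step-function lookup: survival estimate at time t."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s


# Toy cohort (hypothetical values, for illustration only).
months = [2, 5, 5, 8, 12, 14, 14, 20]
died = [1, 1, 0, 1, 0, 1, 0, 0]

curve = kaplan_meier(months, died)
os_12 = survival_at(curve, 12)
print(f"12-month survival estimate: {os_12:.2f}")
```

Patients censored at a given time are treated as still at risk for deaths recorded at that same time, which is the usual convention for this estimator.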
Project description:Introduction: We tested the ability of our natural language processing (NLP) algorithm to identify delirium episodes in a large-scale study using real-world clinical notes. Methods: We used the Rochester Epidemiology Project to identify persons ≥ 65 years who were hospitalized between 2011 and 2017. We identified all persons with an International Classification of Diseases code for delirium within ±14 days of a hospitalization. We independently applied our NLP algorithm to all clinical notes for this same population. We calculated rates using number of delirium episodes as the numerator and number of hospitalizations as the denominator. Rates were estimated overall, by demographic characteristics, and by year of episode, and differences were tested using Poisson regression. Results: In total, 14,255 persons had 37,554 hospitalizations between 2011 and 2017. The code-based delirium rate was 3.02 per 100 hospitalizations (95% CI: 2.85, 3.20). The NLP-based rate was 7.36 per 100 (95% CI: 7.09, 7.64). Rates increased with age (both p < 0.0001). Code-based rates were higher in men compared to women (p = 0.03), but NLP-based rates were similar by sex (p = 0.89). Code-based rates were similar by race and ethnicity, but NLP-based rates were higher in the White population compared to the Black and Asian populations (p = 0.001). Both types of rates increased significantly over time (both p values < 0.001). Conclusions: The NLP algorithm identified more delirium episodes compared to the ICD code method. However, NLP may still underestimate delirium cases because of limitations in real-world clinical notes, including incomplete documentation, practice changes over time, and missing clinical notes in some time periods.
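The rate calculation described in the preceding abstract (episodes as numerator, hospitalizations as denominator) can be sketched as follows, with a log-scale Poisson confidence interval. Two things here are assumptions rather than facts from the source: the episode count of 1,134 is back-calculated from the reported rate of 3.02 per 100 and the 37,554 hospitalizations, and the interval method is one common choice, not necessarily the one the authors used. The function name `rate_per_100` is illustrative.

```python
import math


def rate_per_100(events, denominator, z=1.96):
    """Rate per 100 units with a log-scale Wald interval.

    Assumes the event count is Poisson distributed, so the standard
    error of log(rate) is approximately 1 / sqrt(events).
    """
    rate = events / denominator
    half_width = z / math.sqrt(events)
    lower = rate * math.exp(-half_width)
    upper = rate * math.exp(half_width)
    return 100 * rate, 100 * lower, 100 * upper


# 1134 is an assumed count inferred from the reported rate;
# 37554 hospitalizations is taken from the abstract.
rate, lower, upper = rate_per_100(events=1134, denominator=37554)
print(f"{rate:.2f} per 100 (95% CI: {lower:.2f}, {upper:.2f})")
```

Under these assumptions the result agrees with the reported code-based figures to two decimals, which is consistent with, but does not confirm, a similar interval method in the study.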
Project description:Meaningful real-world evidence (RWE) generation requires unstructured data found in electronic health records (EHRs) which are often missing from administrative claims; however, obtaining relevant data from unstructured EHR sources is resource-intensive. In response, researchers are using natural language processing (NLP) with machine learning (ML) techniques (i.e., ML extraction) to extract real-world data (RWD) at scale. This study assessed the quality and fitness-for-use of EHR-derived oncology data curated using NLP with ML as compared to the reference standard of expert abstraction. Using a sample of 186,313 patients with lung cancer from a nationwide EHR-derived de-identified database, we performed a series of replication analyses demonstrating some common analyses conducted in retrospective observational research with complex EHR-derived data to generate evidence. Eligible patients were selected into biomarker- and treatment-defined cohorts, first with expert-abstracted then with ML-extracted data. We utilized the biomarker- and treatment-defined cohorts to perform analyses related to biomarker-associated survival and treatment comparative effectiveness, respectively. Across all analyses, the results differed by less than 8% between the data curation methods, and similar conclusions were reached. These results highlight that high-performance ML-extracted variables trained on expert-abstracted data can achieve similar results as when using abstracted data, unlocking the ability to perform oncology research at scale.
Project description:Several reviews and case reports have described how information derived from the analysis of genomes is currently included in electronic health records (EHRs) to support clinical decisions. Since the inclusion of this type of information in EHRs is relatively recent (for instance, the widespread adoption of EHRs in the United States is just about a decade old), it is not surprising that a myriad of approaches has been attempted, with various degrees of success. EHR systems undergo much customization to fit the needs of health systems; these approaches have been varied and not always generalizable. The intent of this article is to present a high-level view of these approaches, emphasizing the functionality that they are trying to achieve, and not to advocate for specific solutions, which may become obsolete soon after this review is published. We start by broadly defining the end goal of including genomics in EHRs for healthcare and then explain, using a pharmacogenomics example, the various sources of information that need to be linked to arrive at a clinically actionable genomics analysis. In addition, we include discussions on open issues and a vision for the next-generation systems that integrate whole genome sequencing and EHRs in a seamless fashion.
Project description:Antimalarials (AMs) reduce disease activity and improve survival in patients with systemic lupus erythematosus (SLE), but studies have reported low AM prescribing frequencies. Using a real-world electronic health record cohort, we examined if patient or provider characteristics impacted AM prescribing. We identified 977 SLE cases, 94% of whom were ever prescribed an AM. Older patients and patients with SLE nephritis were less likely to be on AMs. Current age (odds ratio = 0.97, p < 0.01) and nephritis (odds ratio = 0.16, p < 0.01) were both significantly associated with ever AM use after adjustment for sex and race. Of the 244 SLE nephritis cases, only 63% were currently on AMs. SLE nephritis subjects who were currently prescribed AMs were more likely to be followed by a rheumatologist than a nephrologist and less likely to have undergone dialysis or renal transplant (both p < 0.001). Non-current versus current SLE nephritis AM users had higher serum creatinine (p < 0.001), higher urine protein (p = 0.05), and lower hemoglobin levels (p < 0.01). As AMs reduce disease damage and improve survival in patients with SLE, our results demonstrate an opportunity to target future efforts to improve prescribing rates among multi-specialty providers.
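To make the age odds ratio in the preceding abstract easier to interpret, the sketch below rescales it to a ten-year difference. This assumes the reported odds ratio of 0.97 is per one-year increase in age and that the effect is linear on the log-odds scale; the abstract states neither, so treat the result as illustrative. The function name `scale_odds_ratio` is hypothetical.

```python
import math


def scale_odds_ratio(or_per_unit, units):
    """Rescale an odds ratio to a larger increment of the predictor.

    Assumes a logistic model with a linear term, so log-odds add
    and odds ratios multiply across units.
    """
    return math.exp(units * math.log(or_per_unit))


# 0.97 is the reported odds ratio for current age; the per-year
# unit is an assumption, not stated in the abstract.
or_decade = scale_odds_ratio(0.97, 10)
print(f"Odds ratio per 10 years of age: {or_decade:.2f}")
```

Under these assumptions, a ten-year age difference corresponds to roughly 26% lower odds of ever being prescribed an antimalarial.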
Project description:Objectives: Examine whether data from early access to medicines in the USA can be used to inform National Institute for Health and Care Excellence (NICE) health technology assessments (HTA) in oncology. Design: Retrospective cohort study. Setting: Oncology-based community and academic treatment centres in the USA. Participants: Patients present in a nationwide electronic health record (EHR)-derived deidentified database. Interventions: Cancer drugs that underwent NICE technology appraisal (TA) between 2014 and 2019. Primary and secondary outcome measures: The count and follow-up time of US patients, available in the EHR, who were exposed to cancer drugs of interest in the period between Food and Drug Administration (FDA) approval and dates relevant to the NICE appraisal process. Results: In 59 of 60 TAs analysed, the cancer therapy was approved in the USA before the final appraisal by NICE. The median time from FDA approval to the publication of NICE recommendations was 18.5 months, at which time the US EHR-derived database had, on average, 269 patients (SD=356) exposed to the new therapy, with a median of 75.3 person-years (IQR: 13.1-173) in time-at-risk. A case study generated evidence on real-world overall survival and treatment duration. Conclusions: Across different cancer therapies, there was substantial variability in US real-world data accumulated between FDA approval and NICE decision milestones. The applicability of these data to generate evidence for HTA decision-making should be assessed on a case-by-case basis depending on the intended HTA use case.
Project description:Introduction: The objective of this study is to determine the extent and describe the nature of patient-generated health data (PGHD) integration into electronic health records (EHRs) using systematic scoping methods to review the available literature. PGHD have the potential to enhance decision making by providing valuable information that may not ordinarily be captured during a routine care visit. These data, which are captured from mobile devices such as smartphones, activity trackers and other sensors, should be integrated into clinical workflows to allow for optimal use by clinicians. Methods and analysis: This study aims to conduct a rigorous scoping review to explore evidence related to the integration of PGHD into EHRs. Using the framework developed by Arksey and O'Malley, we will create a systematic search strategy, chart data from the relevant articles, and use a qualitative, thematic approach to analyse the data. This review will enable the identification of types of integration and describe challenges and barriers to integrating PGHD. Ethics and dissemination: Database searches will be initiated in June 2019. The review is expected to be completed by October 2019. As the content of the full-text articles emerges, the authors will summarise the characteristics related to the integration of PGHD. The findings of this scoping review will identify research gaps and present implications for future research.
Project description:Electronic health record (EHR)-derived real-world data (RWD) can be sourced to create external comparator cohorts to oncology clinical trials. This exploratory study assessed whether EHR-derived patient cohorts could emulate select clinical trial control arms across multiple tumor types. The impact of analytic decisions on emulation results was also evaluated. By digitizing Kaplan-Meier curves, we reconstructed published control arm results from 15 trials that supported drug approvals from January 1, 2016, to April 30, 2018. RWD cohorts were constructed using a nationwide EHR-derived de-identified database by aligning eligibility criteria and weighting to trial baseline characteristics. Trial data and RWD cohorts were compared using Kaplan-Meier and Cox proportional hazards regression models for progression-free survival (PFS) and overall survival (OS; individual cohorts) and multitumor random effects models of hazard ratios (HRs) for median endpoint correlations (across cohorts). Post hoc, the impact of specific analytic decisions on endpoints was assessed using a case study. Comparing trial data and weighted RWD cohorts, PFS results were more similar (HR range = 0.63-1.18, pooled HR = 0.84, correlation of median = 0.91) compared to OS (HR range = 0.36-1.09, pooled HR = 0.76, correlation of median = 0.85). OS HRs were more variable and trended toward worse for RWD cohorts. The post hoc case study had OS HR ranging from 0.67 (95% confidence interval (CI): 0.56-0.79) to 0.92 (95% CI: 0.78-1.09) depending on specific analytic decisions. EHR-derived RWD can emulate oncology clinical trial control arm results, although with variability. Visibility into clinical trial cohort characteristics may shape and refine analytic approaches.
Project description:Background: Large medical centers in urban areas, like Los Angeles, care for a diverse patient population and offer the potential to study the interplay between genetic ancestry and social determinants of health. Here, we explore the implications of genetic ancestry within the University of California, Los Angeles (UCLA) ATLAS Community Health Initiative, an ancestrally diverse biobank of genomic data linked with de-identified electronic health records (EHRs) of UCLA Health patients (N=36,736). Methods: We quantify the extensive continental and subcontinental genetic diversity within the ATLAS data through principal component analysis, identity-by-descent, and genetic admixture. We assess the relationship between genetically inferred ancestry (GIA) and >1500 EHR-derived phenotypes (phecodes). Finally, we demonstrate the utility of genetic data linked with EHR to perform ancestry-specific and multi-ancestry genome and phenome-wide scans across a broad set of disease phenotypes. Results: We identify 5 continental-scale GIA clusters including European American (EA), African American (AA), Hispanic Latino American (HL), South Asian American (SAA) and East Asian American (EAA) individuals and 7 subcontinental GIA clusters within the EAA GIA corresponding to Chinese American, Vietnamese American, and Japanese American individuals. Although we broadly find that self-identified race/ethnicity (SIRE) is highly correlated with GIA, we still observe marked differences between the two, emphasizing that the populations defined by these two criteria are not analogous. We find a total of 259 significant associations between continental GIA and phecodes even after accounting for individuals' SIRE, demonstrating that for some phenotypes, GIA provides information not already captured by SIRE. GWAS identifies significant associations for liver disease in the 22q13.31 locus across the HL and EAA GIA groups (HL p-value=2.32×10⁻¹⁶, EAA p-value=6.73×10⁻¹¹). A subsequent PheWAS at the top SNP reveals significant associations with neurologic and neoplastic phenotypes specifically within the HL GIA group. Conclusions: Overall, our results explore the interplay between SIRE and GIA within a disease context and underscore the utility of studying the genomes of diverse individuals through biobank-scale genotyping linked with EHR-based phenotyping.