Dataset Information

GENETEX: a GENomics Report TEXt mining R package and Shiny application designed to capture real-world clinico-genomic data.


ABSTRACT:

Objectives

Clinico-genomic data (CGD) acquired through routine clinical practice have the potential to improve our understanding of clinical oncology. However, these data often reside in heterogeneous, semistructured formats, which prolongs time-to-analysis.

Materials and methods

We created GENETEX, an R package and Shiny application for text mining genomic reports from the electronic health record (EHR) and importing the results directly into Research Electronic Data Capture (REDCap).

Results

GENETEX facilitates the abstraction of CGD from the EHR and streamlines the capture of structured data in REDCap. Its functions include natural language processing of key genomic information, transformation of semistructured data into structured data, and import into REDCap. When evaluated against manual abstraction, GENETEX had >99% agreement and captured CGD in approximately one-fifth the time.
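The transformation step described above, from semistructured report text to structured records, can be illustrated with a minimal sketch. This is not GENETEX's implementation (GENETEX is written in R). The report layout, the regular expressions, and the field names (`record_id`, `redcap_repeat_instance`, `gene`, `alteration`, `tmb`) are assumptions chosen for illustration only, and real vendor reports and REDCap data dictionaries will differ.

```python
import json
import re

# Hypothetical report excerpt; the layout is an assumption, not a vendor format.
SAMPLE_REPORT = """\
Genomic Findings
TP53 R175H
BRAF V600E
NF1 loss
Tumor Mutational Burden: 12 Muts/Mb
Microsatellite status: MS-Stable
"""

# A gene symbol (uppercase letters and digits) followed by an alteration.
VARIANT_RE = re.compile(r"^(?P<gene>[A-Z][A-Z0-9]{1,9})\s+(?P<alteration>\S.*)$")
# Tumor mutational burden reported as a number of mutations per megabase.
TMB_RE = re.compile(
    r"Tumor Mutational Burden:\s*(?P<tmb>\d+(?:\.\d+)?)\s*Muts/Mb",
    re.IGNORECASE,
)


def parse_report(text, record_id):
    """Return one flat record per gene alteration found in the report text."""
    tmb_match = TMB_RE.search(text)
    tmb = float(tmb_match.group("tmb")) if tmb_match else None

    records = []
    for line in text.splitlines():
        match = VARIANT_RE.match(line.strip())
        if match is None:
            continue  # headings and free-text lines are skipped
        records.append(
            {
                "record_id": record_id,
                # One repeating instance per alteration (hypothetical field).
                "redcap_repeat_instance": len(records) + 1,
                "gene": match.group("gene"),
                "alteration": match.group("alteration"),
                "tmb": tmb,
            }
        )
    return records


if __name__ == "__main__":
    # Flat JSON records of this shape are what a data-import step would consume.
    print(json.dumps(parse_report(SAMPLE_REPORT, "1"), indent=2))
```

Each output row carries the patient-level value (`tmb`) alongside one alteration, so the rows can be loaded into a repeating instrument without further reshaping. A production pipeline would also need to validate gene symbols against a reference list rather than rely on the pattern alone.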

Conclusions

GENETEX is freely available under the Massachusetts Institute of Technology (MIT) License and can be obtained from GitHub (https://github.com/TheMillerLab/genetex). GENETEX runs in R and is also deployed as a Shiny application for non-R users. It produces high-fidelity abstraction of CGD in a fraction of the time required for manual abstraction.

SUBMITTER: Miller DM 

PROVIDER: S-EPMC8476929 | biostudies-literature |

REPOSITORIES: biostudies-literature
