
Dataset Information


REDBot: Natural language process methods for clinical copy number variation reporting in prenatal and products of conception diagnosis.


ABSTRACT:

BACKGROUND: Current copy number variation (CNV) identification methods have rapidly become mature. However, the postdetection processes such as variant interpretation or reporting are inefficient. To overcome this situation, we developed REDBot as an automated software package for accurate and direct generation of clinical diagnostic reports for prenatal and products of conception (POC) samples.

METHODS: We applied natural language process (NLP) methods for analyzing 30,235 in-house historical clinical reports through active learning, and then, developed clinical knowledge bases, evidence-based interpretation methods and reporting criteria to support the whole postdetection pipeline.

RESULTS: Of the 30,235 reports, we obtained 37,175 CNV-paragraph pairs. For these pairs, the active learning approaches achieved a 0.9466 average F1-score in sentence classification. The overall accuracy for variant classification was 95.7%, 95.2%, and 100.0% in retrospective, prospective, and clinical utility experiments, respectively.

CONCLUSION: By integrating NLP methods in CNVs postdetection pipeline, REDBot is a robust and rapid tool with clinical utility for prenatal and POC diagnosis.
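For readers unfamiliar with the metric reported in the results, the following is a minimal sketch of how an average F1-score for sentence classification could be computed. It assumes a macro average (unweighted mean of per-class F1); the abstract does not say which averaging scheme was used. The label names and example data are hypothetical and are not taken from the REDBot reports.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} from paired true and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong for this sentence
            fn[truth] += 1  # true label was missed for this sentence
    scores = {}
    for label in set(y_true) | set(y_pred):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical sentence labels, for illustration only.
y_true = ["interpretation", "interpretation", "recommendation", "other"]
y_pred = ["interpretation", "other", "recommendation", "other"]
print(macro_f1(y_true, y_pred))
```

A weighted or micro average would give a different value on imbalanced classes, so the reported 0.9466 should be read alongside the averaging method described in the full paper.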

SUBMITTER: Liu M 

PROVIDER: S-EPMC7667294 | biostudies-literature | 2020 Nov

REPOSITORIES: biostudies-literature


Publications

REDBot: Natural language process methods for clinical copy number variation reporting in prenatal and products of conception diagnosis.

Liu Mengmeng, Zhong Yunshan, Liu Hongqian, Liang Desheng, Liu Erhong, Zhang Yu, Tian Feng, Liang Qiaowei, Cram David S, Wang Hua, Wu Lingqian, Yu Fuli

Molecular Genetics & Genomic Medicine, 2020-09-22, issue 11



Similar Datasets

| S-EPMC7812460 | biostudies-literature
| S-EPMC8243933 | biostudies-literature
| S-EPMC8265053 | biostudies-literature
| S-EPMC9329539 | biostudies-literature
2015-07-13 | GSE39332 | GEO
2015-07-13 | E-GEOD-39332 | biostudies-arrayexpress
| S-EPMC8488434 | biostudies-literature
| S-EPMC4411081 | biostudies-literature
| S-EPMC2846957 | biostudies-literature
| S-EPMC5922522 | biostudies-literature