
Dataset Information


GTX.Digest.VCF: an online NGS data interpretation system based on intelligent gene ranking and large-scale text mining.


ABSTRACT:

BACKGROUND: An important task in the interpretation of sequencing data is to highlight pathogenic genes (or detrimental variants) in the field of Mendelian diseases. This remains challenging despite the recent rapid development of genomics and bioinformatics. A typical interpretation workflow includes annotation, filtration, manual inspection and literature review. These steps are time-consuming and error-prone in the absence of systematic support. Therefore, we developed GTX.Digest.VCF, an online DNA sequencing interpretation system, which prioritizes genes and variants for novel disease-gene relation discovery and integrates text mining results to provide literature evidence for the discovery. Its phenotype-driven ranking and biological data mining approach significantly speeds up the whole interpretation process.

RESULTS: The GTX.Digest.VCF system is freely available as a web portal at http://vcf.gtxlab.com for academic research. Evaluation on the DDD project dataset demonstrates an accuracy of 77% (235 out of 305 cases) for top-50 genes and an accuracy of 41.6% (127 out of 305 cases) for top-5 genes.

CONCLUSIONS: GTX.Digest.VCF provides an intelligent web portal for genomics data interpretation via the integration of bioinformatics tools, distributed parallel computing, and biomedical text mining. It can facilitate the application of genomic analytics in clinical research and practice.
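The top-5 and top-50 figures reported in the abstract are top-k accuracies. The sketch below shows one common way such a metric is computed and checks the reported percentages against the stated case counts. It assumes that a case counts as a hit when its known causal gene appears among the first k genes of the ranked list; the record does not spell out the exact criterion, and the function name, the argument names, and the toy gene labels are all hypothetical.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_genes_per_case: Sequence[Sequence[str]],
    causal_gene_per_case: Sequence[str],
    k: int,
) -> float:
    """Fraction of cases whose causal gene is within the top k ranked genes.

    Assumption (not stated in the record): one known causal gene per case,
    and a hit means that gene appears in the first k positions.
    """
    if len(ranked_genes_per_case) != len(causal_gene_per_case):
        raise ValueError("each case needs one ranked list and one causal gene")
    if not ranked_genes_per_case:
        raise ValueError("at least one case is required")
    hits = sum(
        1
        for ranked, causal in zip(ranked_genes_per_case, causal_gene_per_case)
        if causal in ranked[:k]
    )
    return hits / len(ranked_genes_per_case)


# Toy data with made-up gene labels, for illustration only.
toy_rankings = [
    ["GENE_A", "GENE_B", "GENE_C"],
    ["GENE_D", "GENE_E", "GENE_F"],
    ["GENE_G", "GENE_H", "GENE_I"],
]
toy_causal = ["GENE_B", "GENE_F", "GENE_X"]

toy_top2 = top_k_accuracy(toy_rankings, toy_causal, k=2)

# Arithmetic check of the figures quoted in the abstract.
reported_top50 = 235 / 305
reported_top5 = 127 / 305
print(f"top-50: {reported_top50:.1%}, top-5: {reported_top5:.1%}")
```

With the counts from the abstract, 235/305 rounds to 77.0% and 127/305 rounds to 41.6%, which matches the reported values.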

SUBMITTER: Jiang Y 

PROVIDER: S-EPMC6923899 | biostudies-literature | 2019 Dec

REPOSITORIES: biostudies-literature


Publications

GTX.Digest.VCF: an online NGS data interpretation system based on intelligent gene ranking and large-scale text mining.

Jiang Yanhuang, Wu Chengkun, Zhang Yanghui, Zhang Shaowei, Yu Shuojun, Lei Peng, Lu Qin, Xi Yanwei, Wang Hua, Song Zhuo

BMC Medical Genomics, 2019-12-20, Suppl 8



Similar Datasets

S-EPMC3228855 | biostudies-literature
S-EPMC4290788 | biostudies-literature
S-EPMC2276192 | biostudies-other
S-EPMC7426798 | biostudies-literature
S-EPMC7083782 | biostudies-literature
S-EPMC3475109 | biostudies-literature
S-EPMC3694679 | biostudies-literature
S-EPMC4583433 | biostudies-literature
S-EPMC4830514 | biostudies-literature