Dataset Information

regBase: whole genome base-wise aggregation and functional prediction for human non-coding regulatory variants.


ABSTRACT: Predicting functional or pathogenic regulatory variants in the human non-coding genome facilitates the interpretation of disease causation. While numerous prediction methods are available, their performance is inconsistent or restricted to specific tasks, which creates a demand for comprehensive integration of those methods. Here, we compile regBase, a whole-genome base-wise aggregation that incorporates the largest collection of prediction scores. Building on different assumptions of causality, we train three composite models to score functional, pathogenic and cancer driver non-coding regulatory variants, respectively. We demonstrate the superior and stable performance of our models using independent benchmarks and show that they successfully fine-map causal regulatory variants at specific loci or at base-wise resolution. We believe that the regBase database, together with the three composite models, will be useful in different areas of human genetic studies, such as annotation-based causal variant fine-mapping, pathogenic variant discovery and cancer driver mutation identification. regBase is freely available at https://github.com/mulinlab/regBase.

SUBMITTER: Zhang S 

PROVIDER: S-EPMC6868349 | biostudies-literature | 2019 Dec

REPOSITORIES: biostudies-literature

Publications

regBase: whole genome base-wise aggregation and functional prediction for human non-coding regulatory variants.

Zhang Shijie, He Yukun, Liu Huanhuan, Zhai Haoyu, Huang Dandan, Yi Xianfu, Dong Xiaobao, Wang Zhao, Zhao Ke, Zhou Yao, Wang Jianhua, Yao Hongcheng, Xu Hang, Yang Zhenglu, Sham Pak Chung, Chen Kexin, Li Mulin Jun

Nucleic Acids Research, 2019-12-01, issue 21


Similar Datasets

| S-EPMC6647327 | biostudies-literature
| S-EPMC7247590 | biostudies-literature
| S-EPMC7575052 | biostudies-literature
| S-EPMC8934622 | biostudies-literature
| S-EPMC10645545 | biostudies-literature
| S-EPMC5319793 | biostudies-literature
| S-EPMC5892192 | biostudies-literature
| S-EPMC7745885 | biostudies-literature
| S-EPMC6802215 | biostudies-literature
| S-EPMC4773290 | biostudies-literature