Dataset Information

NCBoost classifies pathogenic non-coding variants in Mendelian diseases through supervised learning on purifying selection signals in humans.


ABSTRACT: State-of-the-art methods assessing pathogenic non-coding variants have mostly been characterized on common disease-associated polymorphisms, yet with modest accuracy and strong positional biases. In this study, we curated 737 high-confidence pathogenic non-coding variants associated with monogenic Mendelian diseases. In addition to interspecies conservation, a comprehensive set of recent and ongoing purifying selection signals in humans is explored, accounting for lineage-specific regulatory elements. Supervised learning using gradient tree boosting on such features achieves a high predictive performance and overcomes positional bias. NCBoost performs consistently across diverse learning and independent testing data sets and outperforms other existing reference methods.

SUBMITTER: Caron B 

PROVIDER: S-EPMC6371618 | biostudies-literature | 2019 Feb

REPOSITORIES: biostudies-literature

Publications

NCBoost classifies pathogenic non-coding variants in Mendelian diseases through supervised learning on purifying selection signals in humans.

Barthélémy Caron, Yufei Luo, Antonio Rausell

Genome Biology, 2019-02-11, 1


Similar Datasets

| S-EPMC7593775 | biostudies-literature
| S-EPMC4891680 | biostudies-literature
| S-EPMC4906602 | biostudies-other
| S-EPMC4104271 | biostudies-literature
2021-08-27 | GSE182887 | GEO
| S-EPMC10195113 | biostudies-literature
| S-EPMC10120674 | biostudies-literature
| S-EPMC4111549 | biostudies-literature
| S-EPMC4259975 | biostudies-literature
| S-EPMC7566908 | biostudies-literature