Dataset Information

Prioritization of retinal disease genes: an integrative approach.


ABSTRACT: The discovery of novel disease-associated variations in genes is often a daunting task in highly heterogeneous disease classes. We seek a generalizable algorithm that integrates multiple publicly available genomic data sources in a machine-learning model for the prioritization of candidates identified in patients with retinal disease. To approach this problem, we generate a set of feature vectors from publicly available microarray, RNA-seq, and ChIP-seq datasets of biological relevance to retinal disease, to observe patterns in gene expression specificity among tissues of the body and the eye, in addition to photoreceptor-specific signals from the CRX transcription factor. Using these features, we describe a novel algorithm, positive and unlabeled learning for prioritization (PULP). This article compares several popular supervised learning techniques as the regression function for PULP. The results demonstrate a highly significant enrichment for previously characterized disease genes using a logistic regression method. Finally, a comparison of PULP with the popular gene prioritization tool ENDEAVOUR shows superior prioritization of retinal disease genes from previous studies. The Java source code, compiled binary, assembled feature vectors, and instructions are available online at https://github.com/ahwagner/PULP.
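
The general idea of positive and unlabeled prioritization mentioned in the abstract can be illustrated with a minimal sketch. This is not the PULP implementation (which is Java and lives in the repository above), and this record does not describe PULP's actual procedure. The sketch assumes one simple, generic scheme: treat every unlabeled gene as a provisional negative, fit a logistic regression, and rank the unlabeled genes by predicted score. The function names, the gene identifiers, and the two-feature toy vectors are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, epochs=300, lr=0.1):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def rank_unlabeled(features, known_positives):
    """Rank unlabeled genes by score, highest first.

    Assumption (not taken from the source): unlabeled genes are treated as
    provisional negatives during training, a common baseline for
    positive-unlabeled learning.
    """
    genes = sorted(features)
    X = [features[g] for g in genes]
    y = [1.0 if g in known_positives else 0.0 for g in genes]
    w, b = fit_logistic(X, y)
    scores = {
        g: sigmoid(sum(wj * xj for wj, xj in zip(w, features[g])) + b)
        for g in genes
        if g not in known_positives
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy feature vectors (e.g. two expression-specificity scores).
    toy = {
        "P1": [0.9, 0.8], "P2": [0.8, 0.9], "P3": [0.9, 0.9],
        "U1": [0.1, 0.2], "U2": [0.2, 0.1], "U3": [0.1, 0.1],
        "U4": [0.3, 0.3], "HIDDEN": [0.85, 0.85],
    }
    for gene, score in rank_unlabeled(toy, {"P1", "P2", "P3"}):
        print(gene, round(score, 3))
```

In this toy setup, an unlabeled gene whose features resemble the known positives rises to the top of the ranking even though it was labeled negative during training, which is the behavior a prioritization tool relies on.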

SUBMITTER: Wagner AH 

PROVIDER: S-EPMC4509594 | biostudies-literature | 2013 Jun

REPOSITORIES: biostudies-literature

Publications

Prioritization of retinal disease genes: an integrative approach.

Wagner AH, Taylor KR, DeLuca AP, Casavant TL, Mullins RF, Stone EM, Scheetz TE, Braun TA

Human Mutation, 2013-04-12, issue 6


Similar Datasets

| S-EPMC4500531 | biostudies-literature
| S-EPMC4266634 | biostudies-literature
| S-EPMC7610386 | biostudies-literature
| S-EPMC6050338 | biostudies-literature
| S-EPMC7021177 | biostudies-literature
| S-EPMC8003049 | biostudies-literature
| S-EPMC5581952 | biostudies-literature
| S-EPMC2427257 | biostudies-literature
2021-06-28 | GSE171239 | GEO
| S-EPMC4193716 | biostudies-literature