Dataset Information

Evolutionary modeling and prediction of non-coding RNAs in Drosophila.


ABSTRACT: We performed benchmarks of phylogenetic grammar-based ncRNA gene prediction, experimenting with eight different models of structural evolution and two different programs for genome alignment. We evaluated our models using alignments of twelve Drosophila genomes. We find that ncRNA prediction performance can vary greatly between different gene predictors and subfamilies of ncRNA genes. Our estimates for false positive rates are based on simulations which preserve local islands of conservation; using these simulations, we predict a higher rate of false positives than previous computational ncRNA screens have reported. Using one of the tested prediction grammars, we provide an updated set of ncRNA predictions for D. melanogaster and compare them to previously published predictions and experimental data. Many of our predictions show correlations with protein-coding genes. We found significant depletion of intergenic predictions near the 3' end of coding regions and, furthermore, depletion of predictions in the first intron of protein-coding genes. Some of our predictions are colocated with larger putative unannotated genes: for example, 17 of our predictions showing homology to the RFAM family snoR28 appear in a tandem array on the X chromosome; the 4.5 kbp spanned by the predicted tandem array is contained within a FlyBase-annotated cDNA.

SUBMITTER: Bradley RK 

PROVIDER: S-EPMC2721679 | biostudies-literature | 2009 Aug

REPOSITORIES: biostudies-literature

Publications

Evolutionary modeling and prediction of non-coding RNAs in Drosophila.

Bradley RK, Uzilov AV, Skinner ME, Bendaña YR, Barquist L, Holmes I

PLoS ONE, 2009 Aug 11, issue 8


Similar Datasets

S-EPMC2662882 | biostudies-literature
S-EPMC3441527 | biostudies-literature
S-EPMC2422843 | biostudies-literature
S-EPMC6261887 | biostudies-literature
S-EPMC4872081 | biostudies-other
S-EPMC5282891 | biostudies-literature
S-EPMC4739325 | biostudies-literature
S-EPMC3150283 | biostudies-literature
S-EPMC3827931 | biostudies-literature
S-EPMC2527527 | biostudies-literature