Dataset Information

Accurate prediction of NAGNAG alternative splicing.


ABSTRACT: Alternative splicing (AS) involving NAGNAG tandem acceptors is an evolutionarily widespread class of AS. Recent predictions of alternative acceptor usage reported better results for acceptors separated by larger distances, than for NAGNAGs. To improve the latter, we aimed at the use of Bayesian networks (BN), and extensive experimental validation of the predictions. Using carefully constructed training and test datasets, a balanced sensitivity and specificity of ≥92% was achieved. A BN trained on the combined dataset was then used to make predictions, and 81% (38/47) of the experimentally tested predictions were verified. Using a BN learned on human data on six other genomes, we show that while the performance for the vertebrate genomes matches that achieved on human data, there is a slight drop for Drosophila and worm. Lastly, using the prediction accuracy according to experimental validation, we estimate the number of yet undiscovered alternative NAGNAGs. State of the art classifiers can produce highly accurate prediction of AS at NAGNAGs, indicating that we have identified the major features of the 'NAGNAG-splicing code' within the splice site and its immediate neighborhood. Our results suggest that the mechanism behind NAGNAG AS is simple, stochastic, and conserved among vertebrates and beyond.
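As a rough illustration of the terms used in the abstract, the sketch below locates NAGNAG hexamers (any nucleotide, then AG, twice in tandem) in a DNA string and recomputes the reported validation rate of 38/47. It is not the authors' Bayesian network or pipeline; the function names `find_nagnag` and `validation_rate` and the example sequence are hypothetical and chosen only for this illustration.

```python
import re

# NAGNAG = N-A-G-N-A-G, where N is any nucleotide.
# The lookahead lets overlapping motifs be reported as well.
NAGNAG = re.compile(r"(?=([ACGT]AG[ACGT]AG))")


def find_nagnag(seq):
    """Return (0-based start, hexamer) for every NAGNAG motif in seq."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in NAGNAG.finditer(seq)]


def validation_rate(verified, tested):
    """Fraction of experimentally tested predictions that were verified."""
    return verified / tested


# Hypothetical sequence, used only to show the motif scan.
print(find_nagnag("ttcagcagGT"))

# Figures taken from the abstract: 38 of 47 tested predictions verified.
print(f"{validation_rate(38, 47):.1%}")
```

The scan only finds candidate tandem acceptors by sequence pattern. Whether a given NAGNAG is actually alternatively spliced is the classification problem the publication addresses.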

SUBMITTER: Sinha R 

PROVIDER: S-EPMC2699507 | biostudies-literature | 2009 Jun

REPOSITORIES: biostudies-literature

Publications

Accurate prediction of NAGNAG alternative splicing.

Rileen Sinha, Swetlana Nikolajewa, Karol Szafranski, Michael Hiller, Niels Jahn, Klaus Huse, Matthias Platzer, Rolf Backofen

Nucleic Acids Research, 2009 Apr 9, issue 11


Similar Datasets

| S-EPMC2951375 | biostudies-literature
| S-EPMC4068082 | biostudies-literature
| S-EPMC1380236 | biostudies-literature
| S-EPMC3095350 | biostudies-other
| S-EPMC2375911 | biostudies-literature
| S-EPMC550657 | biostudies-literature
| S-EPMC10651857 | biostudies-literature
| S-EPMC11368187 | biostudies-literature
| S-EPMC10996483 | biostudies-literature
| S-EPMC5435985 | biostudies-literature