Dataset Information

Computational approaches for isoform detection and estimation: good and bad news.

ABSTRACT:

Background

The main goal of whole-transcriptome analysis is to correctly identify all expressed transcripts within a specific cell/tissue, at a particular stage and condition, to determine their structures, and to measure their abundances. RNA-seq data promise to allow identification and quantification of the transcriptome at an unprecedented level of resolution and accuracy, and at low cost. Several computational methods have been proposed for these purposes. However, it is still not clear which promises have already been met and which challenges remain open and require further methodological development.

Results

We carried out a simulation study to assess the performance of five widely used tools: CEM, Cufflinks, iReckon, RSEM, and SLIDE. All of them were run with default parameters. In particular, we considered three annotation scenarios: complete annotation, incomplete annotation, and no annotation at all. Moreover, the methods were compared in three modes of action. In the first mode, the methods were restricted to the isoforms present in the annotation; in the second mode, they were allowed to detect novel isoforms using the annotation as a guide; in the third mode, they operated in a fully data-driven way (although with the support of the alignment to the reference genome). In this third mode, precision and recall are quite poor. By contrast, results are better with the support of the annotation, even when it is incomplete. Finally, the abundance estimation error often shows a very skewed distribution. Performance depends strongly on the true abundance of the isoforms: lowly (and sometimes also moderately) expressed isoforms are poorly detected and estimated. In particular, lowly expressed isoforms are identified mainly if they are provided in the original annotation as potential isoforms.
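
As a rough illustration of the metrics mentioned above, the following minimal sketch computes precision and recall for isoform detection and a per-isoform relative abundance error in a simulated setting where the true isoforms are known. It is not the authors' evaluation code: the function names, the matching of isoforms by identifier, and the choice of relative error are all assumptions made for illustration, and the study may define these quantities differently (for example, by matching isoform structures).

```python
def precision_recall(true_isoforms, predicted_isoforms):
    """Precision and recall of detection, matching isoforms by identifier (assumption)."""
    true_set = set(true_isoforms)
    pred_set = set(predicted_isoforms)
    true_positives = len(true_set & pred_set)
    # Precision: fraction of predicted isoforms that are truly expressed.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    # Recall: fraction of truly expressed isoforms that were predicted.
    recall = true_positives / len(true_set) if true_set else 0.0
    return precision, recall


def relative_errors(true_abundance, estimated_abundance):
    """Relative abundance error per truly expressed isoform.

    Isoforms with no estimate are treated as estimated at 0 (assumption).
    """
    return {
        isoform: abs(estimated_abundance.get(isoform, 0.0) - truth) / truth
        for isoform, truth in true_abundance.items()
        if truth > 0
    }


if __name__ == "__main__":
    # Hypothetical toy data: three simulated isoforms, four predictions.
    truth = {"iso_a": 10.0, "iso_b": 2.0, "iso_c": 50.0}
    estimate = {"iso_b": 1.0, "iso_c": 55.0, "iso_d": 3.0, "iso_e": 0.5}
    print(precision_recall(truth, estimate))
    print(relative_errors(truth, estimate))
```

In this toy setup, an undetected isoform contributes a relative error of 1, which is one way a few missed low-abundance isoforms can skew the error distribution.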

Conclusions

Both detection and quantification of all isoforms from RNA-seq data are still hard problems, and they are affected by many factors. Overall, performance varies significantly with the mode of action and the type of available annotation. With complete or partial annotation, the methods are able to detect most of the expressed isoforms, even though the number of false positives is often high. Fully data-driven approaches require more attention, at least for complex eukaryotic genomes. Improvements are desirable especially for isoform quantification and for the detection of low-abundance isoforms.

SUBMITTER: Angelini C

PROVIDER: S-EPMC4098781 | biostudies-literature | 2014 May

REPOSITORIES: biostudies-literature


Publications

Computational approaches for isoform detection and estimation: good and bad news.

Angelini C, De Canditiis D, De Feis I

BMC Bioinformatics, 2014-05-09

PMID: 24885830

Similar Datasets

Good News about Bad News: Gamified Inoculation Boosts Confidence and Cognitive Immunity Against Fake News.
| S-EPMC6952868 | biostudies-literature

Dangerous infectious diseases: Bad news for Main Street, good news for Wall Street?
| S-EPMC7148704 | biostudies-literature

Primary Prevention Statins in Older Patients: The Good News or the Bad News First?
| S-EPMC7324097 | biostudies-literature

A data-driven characterisation of natural facial expressions when giving good and bad news.
| S-EPMC7652307 | biostudies-literature

Is bad news on TV tickers good news? The effects of voiceover and visual elements in video on viewers' assessment.
| S-EPMC7159199 | biostudies-literature

Mixed News about the Bad News Game.
| S-EPMC10573624 | biostudies-literature

Good News and Bad News About Incentives to Violate the Health Insurance Portability and Accountability Act (HIPAA): Scenario-Based Questionnaire Study.
| S-EPMC7399953 | biostudies-literature

Computational host range prediction-The good, the bad, and the ugly.
| S-EPMC10868548 | biostudies-literature

Meta-analysis of the alpha/beta ratio for prostate cancer in the presence of an overall time factor: bad news, good news, or no news?
| S-EPMC3556929 | biostudies-literature

MTH1 deficiency selectively increases non-cytotoxic oxidative DNA damage in lung cancer cells: more bad news than good?
| S-EPMC5903006 | biostudies-literature