Dataset Information

Data-based filtering for replicated high-throughput transcriptome sequencing experiments.


ABSTRACT:

Motivation

RNA sequencing is now widely performed to study differential expression among experimental conditions. As tests are performed on a large number of genes, stringent false-discovery rate control is required at the expense of detection power. Ad hoc filtering techniques are regularly used to moderate this correction by removing genes with low signal, with little attention paid to their impact on downstream analyses.

Results

We propose a data-driven method based on the Jaccard similarity index to calculate a filtering threshold for replicated RNA sequencing data. In comparisons with alternative data filters regularly used in practice, we demonstrate that the proposed method correctly filters weakly expressed genes, leading to increased detection power for moderately to highly expressed genes. Interestingly, this data-driven threshold varies among experiments, which underscores the value of estimating it from the data rather than fixing it in advance.
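
As a rough illustration of how a Jaccard-based threshold might be chosen, the sketch below marks each gene in each sample as expressed or not at a candidate threshold, computes the Jaccard index between pairs of replicates from the same condition, and keeps the candidate with the highest mean similarity. This is a toy reading of the idea under stated assumptions, not the HTSFilter implementation: the function names, the use of a plain mean over replicate pairs, the grid of candidate thresholds, and the rule for which genes to keep are all hypothetical, and the counts are assumed to be already normalized.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index between two equal-length boolean sequences."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Assumption: define the index as 0 when neither sample has any expressed gene.
    return both / either if either else 0.0


def mean_replicate_similarity(counts, conditions, threshold):
    """Mean Jaccard index over all pairs of samples sharing a condition label.

    counts: list of samples, each a list of per-gene values (assumed normalized).
    conditions: one label per sample.
    """
    scores = []
    for i, j in combinations(range(len(counts)), 2):
        if conditions[i] != conditions[j]:
            continue  # only compare replicates of the same condition
        a = [v > threshold for v in counts[i]]
        b = [v > threshold for v in counts[j]]
        scores.append(jaccard(a, b))
    return sum(scores) / len(scores) if scores else 0.0


def choose_threshold(counts, conditions, candidates):
    """Return the candidate threshold with the highest mean replicate similarity.

    Ties are resolved in favor of the earliest candidate in the list.
    """
    return max(candidates,
               key=lambda s: mean_replicate_similarity(counts, conditions, s))


def keep_genes(counts, threshold):
    """Assumed rule: keep a gene if any sample exceeds the threshold."""
    n_genes = len(counts[0])
    return [any(sample[g] > threshold for sample in counts)
            for g in range(n_genes)]


# Toy data: 4 genes, 2 conditions with 2 replicates each (made-up values).
toy_counts = [
    [0, 3, 50, 100],   # condition A, replicate 1
    [2, 0, 60, 90],    # condition A, replicate 2
    [1, 0, 40, 120],   # condition B, replicate 1
    [0, 2, 45, 110],   # condition B, replicate 2
]
toy_conditions = ["A", "A", "B", "B"]

best = choose_threshold(toy_counts, toy_conditions, candidates=[0, 5, 10])
mask = keep_genes(toy_counts, best)
```

In this toy example the two weakly expressed genes are inconsistent between replicates at a threshold of 0, so a higher candidate gives better agreement and those genes are dropped. For real analyses, the HTSFilter package itself is the reference.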

Availability

The proposed filtering method is implemented in the R package HTSFilter available on Bioconductor.

SUBMITTER: Rau A 

PROVIDER: S-EPMC3740625 | biostudies-literature | 2013 Sep

REPOSITORIES: biostudies-literature

Publications

Data-based filtering for replicated high-throughput transcriptome sequencing experiments.

Andrea Rau, Mélina Gallopin, Gilles Celeux, Florence Jaffrézic

Bioinformatics (Oxford, England), 2013-07-02, issue 17


Similar Datasets

| S-EPMC2906865 | biostudies-literature
| S-EPMC4357712 | biostudies-literature
| S-EPMC7160550 | biostudies-literature
| S-EPMC5333443 | biostudies-literature
| S-EPMC3487918 | biostudies-literature
| S-EPMC4770208 | biostudies-literature
| S-EPMC7763492 | biostudies-literature
| S-EPMC3832420 | biostudies-literature
| S-EPMC3083090 | biostudies-literature
| S-EPMC4629888 | biostudies-literature