
Dataset Information


A Bayesian Semi-parametric Approach for the Differential Analysis of Sequence Counts Data.


ABSTRACT: Data obtained using modern sequencing technologies are often summarized by recording the frequencies of observed sequences. Examples include the analysis of T cell counts in immunological research and studies of gene expression based on counts of RNA fragments. In both cases the items being counted are sequences, of proteins and base pairs, respectively. The resulting sequence-abundance distribution is usually characterized by overdispersion. We propose a Bayesian semi-parametric approach to implement inference for such data. Besides modeling the overdispersion, the approach also takes into account two related sources of bias that are usually associated with sequence counts data: some sequence types may not be recorded during the experiment, and the total count may differ from one experiment to another. We illustrate our methodology with two data sets: one on the analysis of CD4+ T cell counts in healthy and diabetic mice, and another on the comparison of mRNA fragments recorded in a Serial Analysis of Gene Expression (SAGE) experiment with gastrointestinal tissue of healthy and cancer patients.

SUBMITTER: Guindani M 

PROVIDER: S-EPMC4017673 | biostudies-literature | 2014 Apr

REPOSITORIES: biostudies-literature


Publications

A Bayesian Semi-parametric Approach for the Differential Analysis of Sequence Counts Data.

Guindani M, Sepúlveda N, Paulino CD, Müller P

Journal of the Royal Statistical Society. Series C, Applied Statistics, 2014-04-01, issue 3



Similar Datasets

| S-EPMC2876132 | biostudies-literature
| S-EPMC6798423 | biostudies-literature
| S-EPMC8241860 | biostudies-literature
| S-EPMC4111556 | biostudies-literature
| S-EPMC3471932 | biostudies-literature
| S-EPMC5400113 | biostudies-literature
| S-EPMC4112276 | biostudies-literature