Dataset Information

Cluster analysis of gene expression dynamics.


ABSTRACT: This article presents a Bayesian method for model-based clustering of gene expression dynamics. The method represents gene-expression dynamics as autoregressive equations and uses an agglomerative procedure to search for the most probable set of clusters given the available data. The main contributions of this approach are the ability to take into account the dynamic nature of gene expression time series during clustering and a principled way to identify the number of distinct clusters. As the number of possible clustering models grows exponentially with the number of observed time series, we have devised a distance-based heuristic search procedure able to render the search process feasible. In this way, the method retains the important visualization capability of traditional distance-based clustering and acquires an independent, principled measure to decide when two series are different enough to belong to different clusters. The reliance of this method on an explicit statistical representation of gene expression dynamics makes it possible to use standard statistical techniques to assess the goodness of fit of the resulting model and validate the underlying assumptions. A set of gene-expression time series, collected to study the response of human fibroblasts to serum, is used to identify the properties of the method.
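The search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record gives no priors or scoring formula, so a BIC-style score stands in for the Bayesian marginal likelihood, the autoregressive order `p` defaults to 1, and the distance is assumed to be the Euclidean distance between cluster mean profiles (which requires all series to share the same time points). The function names `ar_design`, `bic_score`, and `cluster_series` are hypothetical.

```python
import numpy as np

def ar_design(group, p=1):
    """Pool a group of series into one AR(p) regression: intercept plus p lags."""
    rows, targets = [], []
    for s in group:
        for t in range(p, len(s)):
            rows.append(np.concatenate(([1.0], s[t - p:t][::-1])))
            targets.append(s[t])
    return np.array(rows), np.array(targets)

def bic_score(group, p=1):
    """BIC-style score of one shared AR(p) model for a cluster (higher is better).

    Assumption: this replaces the marginal likelihood used by a Bayesian method.
    """
    X, y = ar_design(group, p)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n = len(y)
    sigma2 = max(float(resid @ resid) / n, 1e-12)  # guard against log(0)
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)
    k = X.shape[1] + 1  # regression coefficients plus the noise variance
    return loglik - 0.5 * k * np.log(n)

def cluster_series(series, p=1):
    """Agglomerative search: try merges in order of increasing distance and
    accept the first one that raises the total score; stop when none does."""
    series = [np.asarray(s, dtype=float) for s in series]
    clusters = [[i] for i in range(len(series))]

    def score(members):
        return bic_score([series[i] for i in members], p)

    while len(clusters) > 1:
        # Assumed distance: Euclidean distance between cluster mean profiles.
        means = [np.mean([series[i] for i in c], axis=0) for c in clusters]
        pairs = sorted(
            (float(np.linalg.norm(means[a] - means[b])), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        for _, a, b in pairs:
            merged = clusters[a] + clusters[b]
            if score(merged) > score(clusters[a]) + score(clusters[b]):
                clusters[a] = merged
                del clusters[b]  # b > a, so index a is unaffected
                break
        else:
            break  # no candidate merge improves the score
    return clusters
```

Calling `cluster_series(list_of_series)` returns a list of clusters, each a list of indices into the input. The number of clusters is not fixed in advance; it falls out of the stopping rule, which is the role the abstract assigns to the principled measure for deciding when two series belong to different clusters.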

SUBMITTER: Ramoni MF 

PROVIDER: S-EPMC123104 | biostudies-literature | 2002 Jul

REPOSITORIES: biostudies-literature

Publications

Cluster analysis of gene expression dynamics.

Marco F. Ramoni, Paola Sebastiani, Isaac S. Kohane

Proceedings of the National Academy of Sciences of the United States of America, 2002-06-24, issue 14



Similar Datasets

S-EPMC3294239 | biostudies-literature
MSV000081443 | MassIVE | 2017-08-16
S-EPMC3224192 | biostudies-literature
S-EPMC7529771 | biostudies-literature
S-EPMC4015031 | biostudies-literature
S-EPMC6333964 | biostudies-literature
S-EPMC6801048 | biostudies-literature
S-EPMC4194701 | biostudies-literature
S-EPMC3122889 | biostudies-literature