Dataset Information

Revealing modularity and organization in the yeast molecular network by integrated analysis of highly heterogeneous genomewide data.


ABSTRACT: The dissection of complex biological systems is a challenging task, made difficult by the size of the underlying molecular network and the heterogeneous nature of the control mechanisms involved. Novel high-throughput techniques are generating massive data sets on various aspects of such systems. Here, we perform analysis of a highly diverse collection of genomewide data sets, including gene expression, protein interactions, growth phenotype data, and transcription factor binding, to reveal the modular organization of the yeast system. By integrating experimental data of heterogeneous sources and types, we are able to perform analysis on a much broader scope than previous studies. At the core of our methodology is the ability to identify modules, namely, groups of genes with statistically significant correlated behavior across diverse data sources. Numerous biological processes are revealed through these modules, which also obey global hierarchical organization. We use the identified modules to study the yeast transcriptional network and predict the function of >800 uncharacterized genes. Our analysis framework, SAMBA (Statistical-Algorithmic Method for Bicluster Analysis), enables the processing of current and future sources of biological information and is readily extendable to experimental techniques and higher organisms.

SUBMITTER: Tanay A 

PROVIDER: S-EPMC365731 | biostudies-literature | 2004 Mar

REPOSITORIES: biostudies-literature

Publications

Revealing modularity and organization in the yeast molecular network by integrated analysis of highly heterogeneous genomewide data.

Amos Tanay, Roded Sharan, Martin Kupiec, Ron Shamir

Proceedings of the National Academy of Sciences of the United States of America, 2004-02-18, issue 9


Similar Datasets

| S-EPMC4403033 | biostudies-other
| S-EPMC8223753 | biostudies-literature
| S-EPMC4416014 | biostudies-literature
| S-EPMC2065896 | biostudies-literature
| S-EPMC2065901 | biostudies-literature
| S-EPMC3226253 | biostudies-literature
| S-EPMC1277257 | biostudies-literature
| S-EPMC5039412 | biostudies-literature
| S-EPMC3130676 | biostudies-literature
| S-EPMC6694132 | biostudies-literature