Revealing Modularity and Organization in the Yeast Molecular Network by Integrated Analysis of Highly Heterogeneous Genome-wide Data

Amos Tanay, Roded Sharan, Martin Kupiec and Ron Shamir

Proc. National Academy of Sciences USA 101 (9) 2981-2986 (March 2004).

The dissection of complex biological systems is a challenging task, made difficult by the size of the underlying molecular network and by the heterogeneous nature of the control mechanisms involved. Novel high-throughput techniques are generating massive datasets on various aspects of such systems. Here we perform the first analysis of a highly diverse collection of genome-wide datasets, including gene expression, protein interactions, growth phenotype data and transcription factor binding, in order to reveal the modular organization of the yeast system. By integrating experimental data of heterogeneous sources and types, we are able to perform analysis on a much broader scope than previous studies. At the core of our methodology is the ability to identify modules, namely, groups of genes with statistically significant correlated behavior across diverse data sources. Numerous biological processes are revealed through these modules, which also obey global hierarchical organization. We use the identified modules to study the yeast transcriptional network and also to predict the function of over 800 uncharacterized genes. Our analysis framework, entitled SAMBA, enables the processing of current as well as future sources of biological information, and is readily extendable to novel experimental techniques and higher organisms.

The SAMBA algorithm is available here both as a stand-alone executable and as part of the Expander software suite.

[Figure: fragment of the modules map]

Shamir's Lab Home Page

Kupiec's Lab Home Page

for comments or questions: Amos Tanay (
