Optimizing the core conserved parts of orthologous modules.
We wished to
ensure that the differences in the cis-elements
enriched in S. cerevisiae and S. pombe are not an artifact
resulting from the way we identified orthologous
transcription modules and that these differences are identifiable even when
looking for cis-elements only in the promoters
of conserved genes shared by the two modules. We therefore repeated the cis-regulatory analysis using perfectly orthologous transcription modules – in which each gene in
one module is matched by at least one ortholog in the
other module. To generate perfect orthologous modules
from a pair of mutual (non-perfect) orthologous ones,
we enhanced the existing SAMBA algorithm. The modified algorithm is initialized
with a pair of modules and starts by removing all non-orthologous
genes in the two modules. The algorithm then iteratively adds and removes pairs
of orthologous genes to improve the total score of
the module pair. In cases where a gene has more than one ortholog,
the algorithm can add either a single gene pair or a larger orthologous
group of genes, depending on which alternative scores higher. The algorithm
outputs a pair of transcription modules, such that each gene in one module has
at least one ortholog in the other module and such
that additional gene pairs cannot be added or removed without decreasing the
total score of the module pair. Importantly, the enriched cis-elements identified in such perfect orthologous module pairs are consistent with those we
reported for the (non-perfect) orthologous modules. They
thus confirm that our findings on the evolutionary dynamics of cis-regulation were not biased by the imperfect orthology.
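The refinement procedure described above can be sketched as a greedy local search over ortholog pairs. This is a minimal illustration under stated assumptions, not the modified SAMBA implementation: the `score` argument is a hypothetical placeholder for the module-pair scoring, the names `project`, `is_perfect` and `refine` are introduced here for illustration only, and the choice of always taking the single best-scoring move is our assumption.

```python
from typing import Callable, FrozenSet, Set, Tuple

Pair = Tuple[str, str]  # (gene in species A, its ortholog in species B)
Score = Callable[[FrozenSet[str], FrozenSet[str]], float]  # hypothetical scoring


def project(pairs: Set[Pair]) -> Tuple[FrozenSet[str], FrozenSet[str]]:
    """Return the two modules (gene sets) implied by a set of ortholog pairs."""
    return frozenset(a for a, _ in pairs), frozenset(b for _, b in pairs)


def is_perfect(module_a: Set[str], module_b: Set[str], orthologs: Set[Pair]) -> bool:
    """True if every gene in each module has an ortholog in the other module."""
    inside = {(a, b) for a, b in orthologs if a in module_a and b in module_b}
    covered_a, covered_b = project(inside)
    return covered_a == frozenset(module_a) and covered_b == frozenset(module_b)


def refine(module_a: Set[str], module_b: Set[str],
           orthologs: Set[Pair], score: Score):
    """Greedy sketch: returns a perfectly orthologous pair of modules."""
    # Initialization: keep only ortholog pairs with both genes in the input
    # modules, which drops every gene lacking an ortholog in the other module.
    current = {(a, b) for a, b in orthologs if a in module_a and b in module_b}
    best = score(*project(current))
    while True:
        moves = []
        for a, b in sorted(orthologs):
            # Toggle a single pair: remove it if present, add it otherwise.
            moves.append(current ^ {(a, b)})
            # Alternatively, add the larger ortholog group sharing either gene.
            group = {p for p in orthologs if p[0] == a or p[1] == b}
            moves.append(current | group)
        candidate = max(moves, key=lambda m: score(*project(m)), default=current)
        candidate_score = score(*project(candidate))
        if candidate_score <= best:
            # No single addition or removal improves the total score.
            return project(current)
        current, best = candidate, candidate_score
```

Because the state is a set of ortholog pairs and the modules are derived from it, every intermediate pair of modules is perfectly orthologous by construction, and the loop stops at a local optimum of whatever score is supplied.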
Here are browsable examples of the derived core-conserved modules: