Using expectation-maximization to infer the early evolution of spliceosomal introns
Liran Carmel, NIH/NLM/NCBI
We propose a detailed model of the evolution of the exon-intron structure of eukaryotic
genes that takes into account gene-specific intron gain and loss rates,
branch-specific gain and loss coefficients, invariant sites incapable of intron gain,
and gamma-distributed variability of both gain and loss rates across sites.
We develop an expectation-maximization algorithm to estimate the parameters of this
model. Using this model, we estimate the intron density of early eukaryotes and
identify regions of the eukaryotic phylogenetic tree with high rates of gain or loss.
We are able to reject the intron-early hypothesis, as well as the extreme intron-late
viewpoint. Instead, we show an interesting kaleidoscope of gain and loss events.