A Sub-quadratic Sequence Alignment Algorithm for Unrestricted Scoring Matrices

Michal Ziv-Ukelson
Department of Computer Science, Technion, Israel.

The classical algorithm for computing the similarity between two
sequences uses a dynamic programming matrix, and compares two strings of
size n in O(n^2) time. In this talk we address the challenge of
computing the similarity of two strings in sub-quadratic time, for
metrics which use a scoring matrix of unrestricted weights. Our
algorithm applies to both local and global similarity computations.

The speed-up is achieved by dividing the dynamic programming matrix into
variable sized blocks, as induced by Lempel-Ziv parsing of both strings,
and utilizing the inherent periodic nature of both strings. This leads
to an O(n^2 / log n) algorithm for an input of constant alphabet size.
For most texts, the time complexity is actually O(h*n^2 /log n) where h
<= 1 is the entropy of the text.

The paper is joint work with Gad M. Landau and Maxime Crochemore.

The above talk will be followed by a brief description of selected
recent results, (a RECOMB 2005 paper, a WABI 2005 paper, a CPM 2005
paper and a SODA 2006 submission, joint work in parts with C.Ben-Zaken
Zilberstein, C. Kent, F. Swidan, G.M. Landau, Z. Yakhini and R.Y.
Pinter), to be further elaborated according to the interest expressed by
the audience.