Using Microarrays to Microdissect Cancer Genomes
Zohar Yakhini, Agilent Laboratories and the Technion

Alterations in DNA copy number are associated with a variety of diseases and conditions, most notably cancer. Comparative genomic hybridization (CGH) is a
technique used to evaluate variations in genomic copy number in cells. Recent technology developments introduced an oligonucleotide array
platform for array-based comparative genomic hybridization (aCGH) analyses (Barrett et al., PNAS 2004; Brennan et al., Cancer Research 2004; Sebat et al.,
Science 2004). This platform provides increased resolution in determining the boundaries of measured genome alterations.

To facilitate the use of CGH instrument data in identifying aberrations in tumor cells, and to leverage these measurements to gain a better
understanding of cancer development and progression, we develop statistical methods, algorithms, software and visualization tools.

In the talk I will review the biological background of cancer genome instabilities, describe the state-of-the-art measurement technologies and discuss
data analysis goals, tasks and solutions in detail.

In particular, we will discuss a statistical framework that enables the
casting of several DNA copy number data analysis questions as
optimization problems over real valued vectors of signals. The
simplest form of the optimization problem seeks to maximize $\score{I}
= \sum_{i \in I} v_i / \sqrt{|I|}$ over all subintervals $I$ of the input
vector. We will present and prove a linear time approximation scheme
for this problem, namely a process with time complexity
$O\left(n\epsilon^{-2}\right)$ that outputs an interval for which
$\score{I}$ is at least ${\mbox{Opt}}/\alpha(\epsilon)$, where
$\mbox{Opt}$ is the actual optimum and
$\alpha(\epsilon)\rightarrow 1$ as $\epsilon \rightarrow 0$. We will
further describe practical implementations that improve the
performance of the naive quadratic approach by orders of
magnitude.
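
For orientation, the naive quadratic baseline mentioned above can be sketched as follows. This is only an illustrative sketch, not the implementation presented in the talk: the function name max_interval_score and the half-open (start, end) indexing are assumptions made here. It evaluates the score of every subinterval, using prefix sums so that each interval sum costs constant time.

```python
import math
from itertools import accumulate


def max_interval_score(v):
    """Exhaustively maximize sum(v[start:end]) / sqrt(end - start).

    Hypothetical helper for illustration only. Returns (score, start, end)
    for the best half-open interval [start, end). Runs in O(n^2) time.
    """
    if not v:
        raise ValueError("input vector must be non-empty")

    # prefix[k] holds v[0] + ... + v[k-1], so any interval sum is one subtraction.
    prefix = [0.0, *accumulate(v)]
    n = len(v)

    best = (-math.inf, 0, 0)
    for start in range(n):
        for end in range(start + 1, n + 1):
            score = (prefix[end] - prefix[start]) / math.sqrt(end - start)
            if score > best[0]:
                best = (score, start, end)
    return best


if __name__ == "__main__":
    signal = [1, -1, 2, 2, 2, -1]
    score, start, end = max_interval_score(signal)
    print(f"best interval [{start}, {end}) with score {score:.3f}")
```

The square-root normalization means a long interval with a modest mean can outscore a short interval with a high mean, which is why all interval lengths have to be considered in the exhaustive version.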