Computational Problems in Modern Human Genetics

Gadi Kimmel
School of Computer Science, TAU

Most genetic variation among human individuals is due to single nucleotide polymorphisms (SNPs). These are single-base genomic sites at which mutations occurred during human history and were passed on through heredity. As a result, two or more different bases (or alleles) are observed across the population at such sites.
Preliminary analyses suggest that knowledge of genome variation will play a key role in disease association studies. Associating genome variation with common diseases will highlight specific regions of the human genome and direct researchers to study them more closely. Hopefully, this will make it possible to improve diagnosis and to develop novel drugs and other therapies for common diseases, such as cancer and cardiovascular diseases. Hence, the identification and analysis of SNPs is currently a major goal of the international scientific community.
During my PhD, we studied several of the major computational problems that arise in the analysis of these new data sets. We used computational techniques from graph theory, probability, and statistics, and integrated them with biological principles to develop models for block partitioning, phasing, tag SNP selection, and association studies.

This talk describes the results of my PhD thesis, which was carried out under the supervision of Prof. Ron Shamir.