Leveraging genetic variability across populations for the identification of causal variants

Noah Zaitlen

Abstract:

Genome-wide association studies have been performed extensively in the last few years, resulting in many new discoveries of genomic regions that are associated with complex traits. It is often the case that a single nucleotide polymorphism (SNP) found to be associated with the condition is not the causal SNP, but a proxy for it due to linkage disequilibrium. In order to identify the actual causal SNP, fine-mapping follow-ups are performed, either by dense genotyping or by sequencing the region. In either case, if the causal SNP is in high linkage disequilibrium with other SNPs, the fine-mapping procedure will require a very large sample size to identify the causal SNP.
Here, we show that leveraging genetic variability across populations significantly increases the power to distinguish a causal SNP: a follow-up study that involves multiple populations has more power than one that involves only a single population. Thus, in contrast to the current practice of analyzing one population at a time, our results suggest that the average power of a joint analysis will be higher than the average power of studies that involve only one population.
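The intuition can be illustrated with a toy calculation. This is a sketch under strong assumptions, not the method behind the results above. Assume a causal SNP with the same standardized per-sample effect in every population, a single proxy SNP whose correlation with it is r in a given population, and a sample-size-weighted combination of per-population z-scores. Under those assumptions the probability that the causal SNP outscores its proxy is Phi(effect * sqrt(S / 2)), where S is the sum over populations of n * (1 - r). The function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def prob_causal_outscores_proxy(effect, design):
    """Toy model: P(combined z-score of causal SNP > that of its proxy).

    effect: standardized per-sample effect, assumed equal in all populations.
    design: list of (n_samples, r) pairs, one per population, where r is the
            correlation (LD) between causal SNP and proxy in that population.
    """
    # Each population contributes n * (1 - r): the weaker the LD, the more
    # its samples help to separate the two SNPs.
    s = sum(n * (1.0 - r) for n, r in design)
    return NormalDist().cdf(effect * sqrt(s / 2.0))


# Hypothetical numbers: 4000 samples in total, LD of 0.95 or 0.60.
single_high_ld = prob_causal_outscores_proxy(0.05, [(4000, 0.95)])
single_low_ld = prob_causal_outscores_proxy(0.05, [(4000, 0.60)])
joint = prob_causal_outscores_proxy(0.05, [(2000, 0.95), (2000, 0.60)])
average_single = (single_high_ld + single_low_ld) / 2.0
```

With these made-up numbers the joint design does better than the average of the two single-population designs, although the single low-LD population would do better still if it could be identified in advance.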
Based on this observation, we developed a framework to efficiently search for a follow-up study design; our framework searches for the best combination of populations from a pool of available populations, to maximize the power for detection of a causal variant. This framework and its accompanying software can be used to considerably enhance the power of fine-mapping studies.
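A design search of this kind can be sketched as follows. This is an illustrative stand-in, not the accompanying software: it uses exhaustive enumeration rather than whatever search the framework actually performs, splits a fixed sample budget evenly among the chosen populations, and scores a design with a toy model (the probability that a causal SNP outscores its proxy, averaged over candidate causal SNPs, assuming an equal standardized effect in every population). The population names, LD values, and function names are all hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist


def average_power(effect, total_n, ld_by_population):
    """Average, over candidate causal SNPs, of P(causal outscores proxy).

    ld_by_population: one list per chosen population; entry j is the
    correlation r between candidate j and its proxy in that population.
    The sample budget is split evenly (an assumption of this sketch).
    """
    n_each = total_n / len(ld_by_population)
    probs = []
    for rs in zip(*ld_by_population):  # one tuple of r values per candidate
        s = sum(n_each * (1.0 - r) for r in rs)
        probs.append(NormalDist().cdf(effect * sqrt(s / 2.0)))
    return sum(probs) / len(probs)


def best_design(effect, total_n, pool, max_populations):
    """Try every subset of up to max_populations and keep the best score."""
    best = None
    for k in range(1, max_populations + 1):
        for names in combinations(sorted(pool), k):
            score = average_power(effect, total_n, [pool[p] for p in names])
            if best is None or score > best[1]:
                best = (names, score)
    return best


# Hypothetical pool: two candidate causal SNPs, LD with the proxy per population.
pool = {"A": [0.99, 0.50], "B": [0.50, 0.99], "C": [0.90, 0.90]}
names, score = best_design(0.05, 4000, pool, max_populations=3)
```

In this made-up pool, populations A and B each resolve one candidate well and the other poorly, so the search pairs them rather than choosing any single population. Exhaustive enumeration grows exponentially with the pool size, so it is only practical for small pools.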