Rapid Association Test for SNPs and Haplotypes Using Importance
Sampling and Linkage Disequilibrium Decay
Gadi Kimmel, School of Computer Science, TAU
Due to the rapid progress in genotyping techniques, many large-scale,
genome-wide disease association studies are now under way. Typically,
the disorders examined are multi-factorial, and therefore researchers
seeking association between the trait and the loci must consider
interactions among loci and between loci and other factors. One of the
challenges of large disease association studies is obtaining accurate
estimates of the significance of discovered associations.
The linkage disequilibrium between SNPs makes the tests highly
dependent, and dependency worsens when interactions are tested. The standard
way of assigning significance (p-value) is by a permutation test.
Unfortunately, in large studies it is prohibitively slow to compute low
p-values by this method.
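
To make the cost concrete, here is a minimal sketch of a standard permutation test, not the algorithm presented in this talk. It assumes binary allele data, a 2x2 chi-square statistic per SNP, and the maximum over SNPs as the test statistic; the function names and these modelling choices are illustrative assumptions only.

```python
import numpy as np

def max_chi2(alleles, is_case):
    """Largest 2x2 chi-square statistic over all SNPs.

    alleles: (n_individuals, n_snps) integer array of 0/1 values (assumed coding).
    is_case: boolean array of length n_individuals.
    """
    n = alleles.shape[0]
    n_case = int(is_case.sum())
    n_ctrl = n - n_case
    ones = alleles.sum(axis=0)          # carriers of allele 1, per SNP
    a = alleles[is_case].sum(axis=0)    # carriers among cases
    c = ones - a                        # carriers among controls
    b = n_case - a                      # non-carriers among cases
    d = n_ctrl - c                      # non-carriers among controls
    denom = n_case * n_ctrl * ones * (n - ones)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Monomorphic SNPs (denom == 0) carry no signal and get statistic 0.
        chi2 = np.where(denom > 0, n * (a * d - b * c) ** 2 / denom, 0.0)
    return float(chi2.max())

def permutation_p_value(alleles, is_case, n_perm=1000, seed=0):
    """Estimate the p-value of the observed maximum by shuffling the labels."""
    rng = np.random.default_rng(seed)
    observed = max_chi2(alleles, is_case)
    hits = 0
    for _ in range(n_perm):
        # Shuffling labels keeps the genotypes, and hence the LD structure, intact.
        if max_chi2(alleles, rng.permutation(is_case)) >= observed:
            hits += 1
    # The +1 terms count the observed labelling itself as one permutation.
    return (hits + 1) / (n_perm + 1)
```

The smallest p-value this estimator can report is 1 / (n_perm + 1), so resolving a p-value near 10^-6 needs on the order of a million permutations, each of which rescans every SNP.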
We present a faster algorithm
for accurately calculating the p-value of a case-control association
permutation test. It is based on importance sampling and on accounting
for the decay in linkage disequilibrium along the chromosome.
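
As a generic illustration of why importance sampling helps with very small probabilities, the sketch below estimates a Gaussian tail probability by sampling from a shifted distribution and reweighting. It is a textbook toy example, not the permutation-sampling scheme of this work; the proposal distribution and the function names are assumptions chosen for illustration.

```python
import math
import numpy as np

def tail_prob_importance(threshold, n_samples=100_000, seed=0):
    """Estimate P(Z > threshold) for standard normal Z by importance sampling."""
    rng = np.random.default_rng(seed)
    # Proposal: a normal centred on the threshold, so about half the draws
    # land in the rare region instead of almost none.
    x = rng.normal(loc=threshold, scale=1.0, size=n_samples)
    # Weight = target density / proposal density
    #        = exp(-x^2 / 2) / exp(-(x - t)^2 / 2) = exp(-t * x + t^2 / 2).
    weights = np.exp(-threshold * x + 0.5 * threshold ** 2)
    return float(np.mean((x > threshold) * weights))

def tail_prob_exact(threshold):
    """Closed-form reference value via the complementary error function."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))
```

Naive sampling from the standard normal would see roughly three exceedances of 4.0 in 100,000 draws, whereas the reweighted estimator uses about half of its draws in the tail.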
The algorithm is dramatically faster than the standard permutation
test. For example, when testing single marker-trait association in
simulations with a thousand SNPs and a thousand cases
and controls, it was over 10,000 times faster.
When testing pairwise interactions among 300 SNPs,
our algorithm was about 100,000 times faster.
On 10,000 SNPs from Chromosome 1, it achieved a 60,000-fold speed-up.
Our method substantially extends the range of problem sizes for which
accurate, meaningful association results are attainable.
Joint work with Ron Shamir.