Outcome prediction in breast cancer: Machine Learning vs Biology

Eytan Domany
Dept of Physics of Complex Systems
Weizmann Institute of Science

Considerable effort has been devoted over the past five years to identifying a gene expression signature that predicts the outcome of breast cancer discovered at an early stage. Different groups used different patient cohorts and different DNA microarrays to produce short lists of predictive genes, and reported high success rates.

I will review some of this work, point out its problematic aspects, and present PAC-ranking, a method designed to estimate the number of training samples needed to produce a robust predictive gene list.

If time permits, I will briefly describe an alternative, biology-based approach to outcome prediction.