The paper considers learning a binary classifier in a setting where training and test examples may come from arbitrarily different distributions. The authors approach this problem via a selective classification algorithm, which returns a classifier together with a subset of the domain on which the classifier abstains from assigning a label, while incurring few abstentions and few misclassification errors. The authors show that the proposed algorithm achieves optimal guarantees for classes of bounded VC dimension. This is an exciting and timely contribution, given the growing interest in robust machine learning and the need to better understand transfer learning. The paper is clearly written, and its results and insights are compelling. Overall, a good paper. Accept!