Abstract
We revisit the fundamental problem of learning with distribution shift, where a learner is given labeled samples from a training distribution D and unlabeled samples from a test distribution D′, and is asked to output a classifier with low test error. The standard approach in this setting is to prove a generalization bound in terms of some notion of distance between D and D′. These distances, however, are difficult to compute, and this has been the main stumbling block for efficient algorithm design.
We sidestep this issue and define a new model called TDS learning, where a learner runs a test on the training samples and the unlabeled test samples, and is allowed to reject if this test detects distribution shift relative to a fixed output classifier. Moreover, when the test accepts, the output classifier is guaranteed to have low test error. We will describe how this approach leads to the first set of efficient algorithms for learning well-studied function classes without making any assumptions on the test distribution. Our techniques touch on a wide array of topics, including pseudorandomness, property testing, and sum-of-squares proofs.
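To make the accept/reject interface concrete, the following is a minimal Python sketch of a TDS-style learner. Everything here is illustrative: the function tds_learn, the least-squares halfspace, and the moment-matching test are stand-ins chosen for brevity, not the algorithms from this work, whose tests are tailored to the function class being learned.

```python
import numpy as np

def tds_learn(X_train, y_train, X_test, tol=0.15):
    """Sketch of a TDS learner: fit a classifier, then accept or reject
    based on a test comparing the training and test samples."""
    # Step 1: fit a fixed output classifier on the labeled training set.
    # (Illustrative choice: a halfspace via least squares.)
    w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

    # Step 2: test for distribution shift between the training marginal
    # and the unlabeled test sample. This toy test compares first and
    # second empirical moments; the real tests are class-specific.
    mean_gap = np.linalg.norm(X_train.mean(axis=0) - X_test.mean(axis=0))
    cov_gap = np.linalg.norm(np.cov(X_train.T) - np.cov(X_test.T))
    if max(mean_gap, cov_gap) > tol:
        return "REJECT", None  # shift detected: no guarantee offered

    # Step 3: on acceptance, the TDS guarantee is that this classifier
    # has low error under the test distribution.
    return "ACCEPT", lambda X: np.sign(X @ w)

# Demo: identical Gaussian marginals, so the test should accept.
rng = np.random.default_rng(0)
X_tr = rng.standard_normal((5000, 3))
y_tr = np.sign(X_tr @ np.array([1.0, -0.5, 0.25]))
X_te = rng.standard_normal((5000, 3))
status, clf = tds_learn(X_tr, y_tr, X_te)
print(status)  # "ACCEPT" with high probability
```

Note the key design point the sketch captures: the learner is never required to certify closeness of D and D′ in some hard-to-compute distance; it only needs a test whose acceptance suffices for the error guarantee of its own output classifier.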
Joint work with Konstantinos Stavropoulos and Arsen Vasilyan.