Abstract

A reasonable approach to figuring out what it takes to be a good fisherman is to collect training data comprising features of fishermen and their daily catch, and then learn a model mapping those features to the size of the catch. Reasonable as this approach may sound, it will most likely result in a biased model. The reason for this bias is that the training data will miss all those individuals who were not good enough at fishing and decided to become hunters (or do something else) instead. Such self-selection bias is pervasive. From understanding what it takes to be a good college student or company employee to learning from expert demonstrations and understanding strategic behavior in markets, the data available for learning statistical models are the result of strategic decisions that have already operated on and filtered out some of the relevant data. I will discuss recent progress on some classical econometric challenges revolving around the estimation of linear models under self-selection bias and the identification of non-parametric auction models, and present several open directions for future investigation. This talk is based on joint work with Yeshwanth Cherapanamjeri, Andrew Ilyas, and Manolis Zampetakis.
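The fisherman example can be made concrete with a small simulation. The sketch below is a toy model and not the estimator or setting from the talk: it assumes a single "skill" feature, a linear catch model with Gaussian noise, and an independent Gaussian hunting payoff, and it assumes each individual fishes only when fishing pays more than hunting. The function name, variable names, and parameter values are all hypothetical. Under these assumptions, ordinary least squares fitted to fishermen alone tends to underestimate the effect of skill on catch, whereas the same fit on the full (normally unobserved) population recovers it.

```python
import numpy as np


def simulate_self_selection(n=200_000, true_slope=2.0, seed=0):
    """Toy illustration of self-selection bias (all modelling choices are assumptions).

    Returns the OLS slope on the full population, the OLS slope on the
    self-selected fishermen only, and the share of people who chose fishing.
    """
    rng = np.random.default_rng(seed)

    # One hypothetical feature per individual.
    skill = rng.normal(size=n)

    # Assumed linear model for the daily catch, with independent noise.
    catch = true_slope * skill + rng.normal(size=n)

    # Assumed outside option: hunting payoff unrelated to fishing skill.
    hunting_payoff = rng.normal(size=n)

    # Self-selection: a person fishes only if fishing pays more than hunting.
    is_fisherman = catch > hunting_payoff

    # np.polyfit with degree 1 returns [slope, intercept].
    slope_all = np.polyfit(skill, catch, 1)[0]
    slope_fishermen = np.polyfit(skill[is_fisherman], catch[is_fisherman], 1)[0]

    return slope_all, slope_fishermen, is_fisherman.mean()


if __name__ == "__main__":
    slope_all, slope_fishermen, share = simulate_self_selection()
    print(f"share who chose fishing:      {share:.3f}")
    print(f"OLS slope, whole population:  {slope_all:.3f}")
    print(f"OLS slope, fishermen only:    {slope_fishermen:.3f}")
```

In this toy setup the bias arises because low-skill individuals appear in the fishermen sample only when their noise term happened to be large, which flattens the fitted line. Correcting for this without observing the people who opted out is the kind of problem the abstract refers to.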

Video Recording