Abstract

We present a short overview of several information-theoretic methods for understanding and ensuring fairness in machine learning algorithms. First, we discuss how perturbation-of-measure approaches (e.g., influence functions) can be used to interpret and correct for bias in a given machine learning model. We then review how tools from rate-distortion theory can be used to design data pre-processing mechanisms that ensure fairness. Finally, we conclude with future research directions that may be of interest to both data scientists and information theorists.

Video Recording