Abstract
We consider a setting in which a designer wishes to assess the efficacy of a privacy/fairness scheme by evaluating its performance against a finite-capacity adversary that seeks to learn a sensitive attribute from released data. Here, a finite-capacity adversary is a learning agent with limited statistical knowledge (a finite number of data samples) and limited expressiveness (restricted to the functions representable by a neural network). We provide probabilistic bounds on the discrepancy between the risk of such a finite-capacity adversary and that of an infinite-capacity adversary, that is, one with full statistical knowledge and unrestricted expressiveness, under the squared and log-losses. Our bounds quantify both the generalization error resulting from limited samples and the function approximation error resulting from finite expressiveness. We illustrate our results for both scalar and multi-dimensional Gaussian mixture models.
Based on joint work with Mario Diaz (ASU/CIMAT), Chong Huang (ASU), and Lalitha Sankar (ASU).
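
To make the quantity being bounded concrete, the following is a minimal sketch of the risk gap under the log-loss for a scalar two-component Gaussian mixture. It is an illustration under stated assumptions, not the construction used in the work: the sensitive attribute S is taken to be uniform on {0, 1}, the released data X given S is Gaussian with mean -mu or +mu and unit variance, and the finite-capacity adversary is a one-parameter logistic model trained by gradient descent, standing in for a neural network. The function names, mu = 1, the sample sizes, and the training hyperparameters are all hypothetical. Because this model class happens to contain the true posterior, the sketch isolates the generalization part of the gap; the approximation part would appear only with a more restricted class or a richer mixture.

```python
import numpy as np


def sample_mixture(n, mu, rng):
    """Draw n pairs (x, s): s ~ Bernoulli(1/2), x | s ~ N((2s - 1) * mu, 1)."""
    s = rng.integers(0, 2, size=n)
    x = rng.normal(loc=(2 * s - 1) * mu, scale=1.0, size=n)
    return x, s


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def log_loss(p, s, eps=1e-12):
    """Empirical log-loss of predicted probabilities p for binary labels s."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(s * np.log(p) + (1 - s) * np.log(1 - p)))


def bayes_posterior(x, mu):
    """Infinite-capacity adversary: exact P(S = 1 | X = x) for this mixture.

    With equal priors and unit variance the log-likelihood ratio is 2 * mu * x.
    """
    return sigmoid(2.0 * mu * x)


def fit_logistic(x, s, steps=2000, lr=0.1):
    """Finite-capacity adversary (assumed stand-in): logistic model fitted
    by plain gradient descent on the empirical log-loss."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        g = sigmoid(w * x + b) - s  # gradient of the log-loss w.r.t. the logit
        w -= lr * np.mean(g * x)
        b -= lr * np.mean(g)
    return w, b


def risk_gap(n_train, mu=1.0, n_test=200_000, seed=0):
    """Estimate (finite-capacity risk) - (infinite-capacity risk) on fresh data."""
    rng = np.random.default_rng(seed)
    x_tr, s_tr = sample_mixture(n_train, mu, rng)
    x_te, s_te = sample_mixture(n_test, mu, rng)
    w, b = fit_logistic(x_tr, s_tr)
    finite = log_loss(sigmoid(w * x_te + b), s_te)
    infinite = log_loss(bayes_posterior(x_te, mu), s_te)
    return finite - infinite


if __name__ == "__main__":
    for n in (20, 200, 5000):
        print(n, risk_gap(n))
```

Running the script prints the estimated gap for a few training-set sizes; the test-set estimate of each risk is itself a Monte Carlo approximation, so small fluctuations around the true gap are expected.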