Playlist: 20 videos
Learning in the Presence of Strategic Behavior
1:5:0
Panayotis Mertikopoulos (French National Center for Scientific Research (CNRS))
The Limits Of Regularized Learning In Games
Learning in the Presence of Strategic Behavior
This talk focuses on the following question: Does learning from empirical observations in a game-theoretic setting converge to a Nash equilibrium? And, if so, at what rate? A well-known impossibility result in the field precludes the possibility of a "universally positive" answer - i.e., the existence a dynamical process which, based on player-specific information alone, converges to Nash equilibrium in all games. In view of this, we will instead examine the equilibrium convergence properties of a class of widely used no-regret learning processes which includes the exponential weights algorithm and the “follow the regularized leader” family of methods, their optimistic variants, extra-gradient schemes, and many others. In this general context, we establish a comprehensive equivalence between the stability of a Nash equilibrium and its support: a Nash equilibrium is locally stable and attracting if and only if it is strict (i.e., each equilibrium strategy has a unique best response). We will also discuss the robustness of this equivalence to different feedback models - from full information to bandit, payoff-based feedback - as well as the methods' rate of convergence in terms of the underlying regularizer and the type of feedback available to the players.
Visit talk page
The Limits Of Regularized Learning In Games
Learning in the Presence of Strategic Behavior
This talk focuses on the following question: Does learning from empirical observations in a game-theoretic setting converge to a Nash equilibrium? And, if so, at what rate? A well-known impossibility result in the field precludes the possibility of a "universally positive" answer - i.e., the existence a dynamical process which, based on player-specific information alone, converges to Nash equilibrium in all games. In view of this, we will instead examine the equilibrium convergence properties of a class of widely used no-regret learning processes which includes the exponential weights algorithm and the “follow the regularized leader” family of methods, their optimistic variants, extra-gradient schemes, and many others. In this general context, we establish a comprehensive equivalence between the stability of a Nash equilibrium and its support: a Nash equilibrium is locally stable and attracting if and only if it is strict (i.e., each equilibrium strategy has a unique best response). We will also discuss the robustness of this equivalence to different feedback models - from full information to bandit, payoff-based feedback - as well as the methods' rate of convergence in terms of the underlying regularizer and the type of feedback available to the players.
1:6:46
Costantinos Daskalakis (MIT)
https://simons.berkeley.edu/talks/tbd-371
Learning in the Presence of Strategic Behavior
Abstract: A reasonable approach to figure this out is to collect training data comprising features of fishermen and their daily catch, and then learn a model mapping fishermen features to the size of their catch. Reasonable as this approach may sound, it will most likely result in a biased model. The reason for this bias is that the training data will miss all those individuals who were not good enough at fishing and decided to become hunters (or do something else) instead. Such self-selection bias is pervasive. From understanding what it takes to be a good college student or company employee to learning from expert demonstrations and understanding strategic behavior in markets, data available for learning statistical models are the results of strategic decisions that have already operated on and filtered out some of the relevant data. I will discuss recent progress on some classical econometric challenges revolving around estimating linear models under self-selection bias, and identification of non-parametric auction models, and present several open directions for future investigation. This talk is based on joint works with Yeshwanth Cherapanamjeri, Andrew Ilyas, Manolis Zampetakis.
Visit talk page
https://simons.berkeley.edu/talks/tbd-371
Learning in the Presence of Strategic Behavior
Abstract: A reasonable approach to figure this out is to collect training data comprising features of fishermen and their daily catch, and then learn a model mapping fishermen features to the size of their catch. Reasonable as this approach may sound, it will most likely result in a biased model. The reason for this bias is that the training data will miss all those individuals who were not good enough at fishing and decided to become hunters (or do something else) instead. Such self-selection bias is pervasive. From understanding what it takes to be a good college student or company employee to learning from expert demonstrations and understanding strategic behavior in markets, data available for learning statistical models are the results of strategic decisions that have already operated on and filtered out some of the relevant data. I will discuss recent progress on some classical econometric challenges revolving around estimating linear models under self-selection bias, and identification of non-parametric auction models, and present several open directions for future investigation. This talk is based on joint works with Yeshwanth Cherapanamjeri, Andrew Ilyas, Manolis Zampetakis.
1:8:56
Francesca Mollinari (Cornell University)
https://simons.berkeley.edu/talks/identification-analysis-models-set-valued-predictions
Learning in the Presence of Strategic Behavior
Static, simultaneous-move finite games of complete and incomplete information in the presence of multiple equilibria; network formation models; and auction models under weak assumptions on bidding behavior or on bidders’ information, are examples of strategic interaction models that deliver set-valued predictions for the optimal action that each agent can take. In the absence of (typically untestable) assumptions specifying a mechanism to select one prediction out of the set implied by the model, point identification of agents’ preferences based on observing the joint distribution of actions taken (and covariates) cannot be achieved. Rather, there is a set of observationally equivalent preferences that can rationalize the observed data. This talk presents a methodology based on tools from random set theory to characterize the set of observationally equivalent preferences through a collection of moment inequalities, and describes existing approaches for statistical inference based on finite data.
Visit talk page
https://simons.berkeley.edu/talks/identification-analysis-models-set-valued-predictions
Learning in the Presence of Strategic Behavior
Static, simultaneous-move finite games of complete and incomplete information in the presence of multiple equilibria; network formation models; and auction models under weak assumptions on bidding behavior or on bidders’ information, are examples of strategic interaction models that deliver set-valued predictions for the optimal action that each agent can take. In the absence of (typically untestable) assumptions specifying a mechanism to select one prediction out of the set implied by the model, point identification of agents’ preferences based on observing the joint distribution of actions taken (and covariates) cannot be achieved. Rather, there is a set of observationally equivalent preferences that can rationalize the observed data. This talk presents a methodology based on tools from random set theory to characterize the set of observationally equivalent preferences through a collection of moment inequalities, and describes existing approaches for statistical inference based on finite data.
1:6:31
Grant Schoenebeck (University of Michigan)
https://simons.berkeley.edu/talks/mechanisms-procure-information-without-verification
Learning in the Presence of Strategic Behavior
In a variety of contexts such as peer review, peer grading, and crowd-sourcing we would like to reward agents for producing high quality information. Unfortunately, computing rewards by comparing to ground truth or gold standard is often cumbersome, costly, or impossible. Instead we would like to compare agent reports. A key challenge is that agents may strategically withhold effort or information when they believe their payoff will be based upon comparison with other agents whose reports will likely omit this information due to lack of effort or expertise. This talk will show how to translate machine learning solutions, which provide a forecast for an agent's report (given the other agents' reports), into provably incentive compatible mechanisms. The key theoretical technique is a variational interpretation of mutual information, which permits machine learning to estimate mutual information using only a few samples (and likely has other applications in machine learning even beyond strategic settings). Time-permitting, we will show empirically how augmenting information elicitation mechanisms with basic machine learning techniques can improve the ex post fairness of rewards, an important criteria for many applications. (No information theory knowledge will be assumed.)
Visit talk page
https://simons.berkeley.edu/talks/mechanisms-procure-information-without-verification
Learning in the Presence of Strategic Behavior
In a variety of contexts such as peer review, peer grading, and crowd-sourcing we would like to reward agents for producing high quality information. Unfortunately, computing rewards by comparing to ground truth or gold standard is often cumbersome, costly, or impossible. Instead we would like to compare agent reports. A key challenge is that agents may strategically withhold effort or information when they believe their payoff will be based upon comparison with other agents whose reports will likely omit this information due to lack of effort or expertise. This talk will show how to translate machine learning solutions, which provide a forecast for an agent's report (given the other agents' reports), into provably incentive compatible mechanisms. The key theoretical technique is a variational interpretation of mutual information, which permits machine learning to estimate mutual information using only a few samples (and likely has other applications in machine learning even beyond strategic settings). Time-permitting, we will show empirically how augmenting information elicitation mechanisms with basic machine learning techniques can improve the ex post fairness of rewards, an important criteria for many applications. (No information theory knowledge will be assumed.)
1:21:40
Sergiu Hart (Hebrew University of Jerusalem)
https://simons.berkeley.edu/talks/forecast-hedging-calibration-and-game-equilibria
Learning in the Presence of Strategic Behavior
Calibration means that forecasts and average realized frequencies are close. We develop the concept of "forecast hedging", which consists of choosing the forecasts so as to guarantee that the expected track record can only improve. This yields all the calibration results by the same simple basic argument, while differentiating between them by the forecast-hedging tools used: deterministic and fixed point-based versus stochastic and minimax-based. Additional contributions are an improved definition of continuous calibration, ensuing game dynamics that yield Nash equilibria in the long run, and a new calibrated forecasting procedure for binary events that is simpler than all known such procedures.
Visit talk page
https://simons.berkeley.edu/talks/forecast-hedging-calibration-and-game-equilibria
Learning in the Presence of Strategic Behavior
Calibration means that forecasts and average realized frequencies are close. We develop the concept of "forecast hedging", which consists of choosing the forecasts so as to guarantee that the expected track record can only improve. This yields all the calibration results by the same simple basic argument, while differentiating between them by the forecast-hedging tools used: deterministic and fixed point-based versus stochastic and minimax-based. Additional contributions are an improved definition of continuous calibration, ensuing game dynamics that yield Nash equilibria in the long run, and a new calibrated forecasting procedure for binary events that is simpler than all known such procedures.
1:22:20
Georgios Piliouras (Singapore University of Technology and Design)
https://simons.berkeley.edu/talks/optimal-no-regret-learning-general-games-clairvoyant-mwu
Learning in the Presence of Strategic Behavior
In this paper we solve the problem of no-regret learning in general games. Specifically, we provide a novel, simple and efficiently computable learning algorithm, Clairvoyant Multiplicative Weights Updates (CMWU), that achieves constant regret in all normal-form games with fixed step-sizes. The cumulative regret of our algorithm provably decreases linearly as the step-size increases. Our findings depart from the prevailing paradigm that vanishing step-sizes are a prerequisite for low regret as championed by all state-of-the-art methods to date. Our arguments extend well beyond normal-form games with little effort.
Visit talk page
https://simons.berkeley.edu/talks/optimal-no-regret-learning-general-games-clairvoyant-mwu
Learning in the Presence of Strategic Behavior
In this paper we solve the problem of no-regret learning in general games. Specifically, we provide a novel, simple and efficiently computable learning algorithm, Clairvoyant Multiplicative Weights Updates (CMWU), that achieves constant regret in all normal-form games with fixed step-sizes. The cumulative regret of our algorithm provably decreases linearly as the step-size increases. Our findings depart from the prevailing paradigm that vanishing step-sizes are a prerequisite for low regret as championed by all state-of-the-art methods to date. Our arguments extend well beyond normal-form games with little effort.
1:12:55
Drew Fudenberg (MIT)
https://simons.berkeley.edu/talks/predicting-cooperation-rates-learning-model
Learning in the Presence of Strategic Behavior
We use simulations of a simple learning model to predict how cooperation varies with treatment in the experimental play of the indefinitely repeated prisoner's dilemma. We suppose that learning and the game parameters only influence play in the initial round of each supergame, and that after these rounds play depends only on the outcome of the previous round. Using data from 17 papers, we find that our model predicts out-of-sample cooperation at least as well as more complicated models with more parameters and harder-to-interpret machine learning algorithms. Our results let us predict how cooperation rates change with longer experimental sessions, and help explain past findings on the role of strategic uncertainty.
Visit talk page
https://simons.berkeley.edu/talks/predicting-cooperation-rates-learning-model
Learning in the Presence of Strategic Behavior
We use simulations of a simple learning model to predict how cooperation varies with treatment in the experimental play of the indefinitely repeated prisoner's dilemma. We suppose that learning and the game parameters only influence play in the initial round of each supergame, and that after these rounds play depends only on the outcome of the previous round. Using data from 17 papers, we find that our model predicts out-of-sample cooperation at least as well as more complicated models with more parameters and harder-to-interpret machine learning algorithms. Our results let us predict how cooperation rates change with longer experimental sessions, and help explain past findings on the role of strategic uncertainty.
1:7:31
Lillian Ratliff (University of Washington)
Beyond Open Loop Algorithm Design: Learning from Decision-Dependent Data
Learning in the Presence of Strategic Behavior
Learning algorithms are increasingly being deployed in a variety of real world systems. A central tenet of present day machine learning is that when it is arduous to model a phenomenon, observations thereof are representative samples from some, perhaps unknown, static or otherwise independent distribution. In the context of systems such as civil infrastructure and online marketplaces, at least one challenge calls into question the efficacy of this tenet. Data used to either train the algorithms offline or as an input to an online decision-making algorithm may be generated by strategic data sources such as human users. Indeed, such data depends on how the algorithm impacts a user’s individual objectives or (perceived) quality of service. This leads to an interesting feedback mechanism wherein the underlying data distribution depends on the output of the algorithm. This begs the question of how learning algorithms can and should be designed taking into consideration this closed-loop interaction with the environment. In this talk, I will provide one perspective on designing and analyzing algorithms by modeling the underlying learning task in the language of game theory and control, and using tools from these domains to provide performance guarantees on the convergence of algorithms. Recent, promising results in this direction including those on modeling competition between learning algorithms in presence of strategic behavior will be highlighted. Time permitting, open questions will be posed towards the end of the talk.
Visit talk page
Beyond Open Loop Algorithm Design: Learning from Decision-Dependent Data
Learning in the Presence of Strategic Behavior
Learning algorithms are increasingly being deployed in a variety of real world systems. A central tenet of present day machine learning is that when it is arduous to model a phenomenon, observations thereof are representative samples from some, perhaps unknown, static or otherwise independent distribution. In the context of systems such as civil infrastructure and online marketplaces, at least one challenge calls into question the efficacy of this tenet. Data used to either train the algorithms offline or as an input to an online decision-making algorithm may be generated by strategic data sources such as human users. Indeed, such data depends on how the algorithm impacts a user’s individual objectives or (perceived) quality of service. This leads to an interesting feedback mechanism wherein the underlying data distribution depends on the output of the algorithm. This begs the question of how learning algorithms can and should be designed taking into consideration this closed-loop interaction with the environment. In this talk, I will provide one perspective on designing and analyzing algorithms by modeling the underlying learning task in the language of game theory and control, and using tools from these domains to provide performance guarantees on the convergence of algorithms. Recent, promising results in this direction including those on modeling competition between learning algorithms in presence of strategic behavior will be highlighted. Time permitting, open questions will be posed towards the end of the talk.
1:18:31
Tim Roughgarden (Columbia University)
https://simons.berkeley.edu/talks/tbd-368
Learning in the Presence of Strategic Behavior
Visit talk page
https://simons.berkeley.edu/talks/tbd-368
Learning in the Presence of Strategic Behavior
0:56:35
Colin F. Camerer (Caltech)
https://simons.berkeley.edu/talks/tbd-378
Learning in the Presence of Strategic Behavior
The mathematical definition of game-theoretic equilibrium was not particularly motivated by any particular process by which equilibration could occur (though Nash mentioned a “mass action” population dynamics mechanism in his short thesis). Since then, many learning theories have been suggested and studied carefully, mostly in lab experiments. These theories are useful because they can suggest when equilibration is slow or fast, and also might predict which of several multiple equilibria are more likely to occur. This talk will concentrate on reinforcement, fictitious play, and hybrid EWA learning.
Visit talk page
https://simons.berkeley.edu/talks/tbd-378
Learning in the Presence of Strategic Behavior
The mathematical definition of game-theoretic equilibrium was not particularly motivated by any particular process by which equilibration could occur (though Nash mentioned a “mass action” population dynamics mechanism in his short thesis). Since then, many learning theories have been suggested and studied carefully, mostly in lab experiments. These theories are useful because they can suggest when equilibration is slow or fast, and also might predict which of several multiple equilibria are more likely to occur. This talk will concentrate on reinforcement, fictitious play, and hybrid EWA learning.