Abstract
Reinforcement learning (RL) has made substantial empirical progress on hard AI challenges in the past few years. Much of this progress (Go, Dota 2, StarCraft II, economic simulation, social behavior learning, and so on) comes from multi-agent RL, that is, sequential decision making involving more than one agent. While the theoretical study of single-agent RL has a long history and rapidly growing recent interest, multi-agent RL theory is arguably a newer and less developed field, with its own unique challenges and opportunities.
In this tutorial, we present an overview of recent theoretical developments in multi-agent RL. We will focus on the model of Markov games and cover basic formulations, objectives, and learning algorithms with their theoretical guarantees. We will discuss the inefficiency of classical algorithms such as self-play and fictitious play, and then turn to recent advances in provably efficient learning algorithms across various regimes, including two-player zero-sum games, multiplayer general-sum games, exploration, and large state spaces (function approximation).