Topics in Multiagent Learning

MIT 6.S890 — Fall 2024

While machine learning techniques have had significant success in single-agent settings, an increasingly large body of literature has been studying settings involving several learning agents with different objectives. In these settings, standard training methods, such as gradient descent, are less successful and the simultaneous learning of the agents commonly leads to nonstationary and even chaotic system dynamics.

Motivated by these challenges, this course presents the foundations of multi-agent systems from a combined game-theoretic, optimization and learning-theoretic perspective. We start with basic matrix games, such as rock-paper-scissors, and advance to more complex forms including imperfect information games and structured games like combinatorial games, polymatrix games, and stochastic games. Topics will cover various equilibrium concepts, aspects of equilibrium computation and learning, as well as the computational complexity of finding equilibria. Additionally, we will examine how these models and methods have driven recent breakthroughs in AI, producing human- and superhuman-level agents for well-known games such as Go, Poker, Diplomacy, and Stratego. A tentative course syllabus can be found below.

Course Info

Instructors: Gabriele Farina (✉️ gfarina@)
Time: Tuesdays & Thursdays, 11:00am - 12:30pm
Location: 3-333
TA: Zhiyuan Fan (✉️ fanzy@)
Office hours: see Canvas
Prerequisites: Discrete Mathematics and Algorithms at the advanced undergraduate level; mathematical maturity.
This course will use Canvas

Course Structure

The course will be lecture-based. At the end of the course there will be a few lectures of project presentations by students. Readings will consist of a mixture of textbooks and course notes, which will be uploaded after lectures.

This course will include 2-3 homework sets and a project presentation. Projects may be done individually or in groups of 2-3 students and can be theoretical or experimental. Grading will be as follows:

50% final project
50% homework sets

Schedule (subject to change)

#	Date	Topic	Readings	Lecture notes
1	9/5	Introduction to the course and logistics	Syllabus	Slides
Part I: Normal-form games
2	9/10	Setting and equilibria: the Nash equilibrium Definition of normal-form games. Solution concepts and Nash equilibrium. Nash equilibrium existence theorem. Brouwer's fixed point theorem.	Nash '51	Notes html \| pdf
3	9/12	Setting and equilibria: the correlated equilibrium Topological and computational properties of the set of Nash equilibria in normal-form games. Connections with linear programming. Definition of correlated and coarse correlated equilibria; relationships with Nash equilibria.	von Stengel '24	Notes html \| pdf
4	9/17 HW1	Learning in games: Foundations Regret and hindsight rationality. Definition of regret minimization and relationships with equilibrium concepts.	Intro of BM'07	Notes html \| pdf
5	9/19	Learning in games: Algorithms (part I) General principles in the design of learning algorithms. Follow-the-leader, regret matching, multiplicative weights update, online mirror descent.		Notes html \| pdf
6	9/24	Learning in games: Algorithms (part II) Optimistic mirror descent and optimistic follow-the-regularized-leader. Accelerated computation of approximate equilibria.	SALS'15	Notes html \| pdf
7	9/26	Learning in games: Bandit feedback From multiplicative weights to Exp3. General principles. Obtaining high-probability bounds.	ZS'21	Notes html \| pdf
8	10/1	Learning in games: Φ-regret minimization Blum-Mansour construction of a swap regret minimizer. Gordon, Greenwald, and Marks general construction for Φ-regret minimizers.	GGM'08 \| BM'07	Notes html \| pdf
Part II: Extensive-form games
9	10/3	Foundations of extensive-form games Complete versus imperfect information. Kuhn's theorem. Normal-form and sequence-form strategies. Similarities and differences with normal-form games.		Notes html \| pdf
10	10/8 HW2	Learning in extensive-form games No-regret algorithms for extensive-form games. Counterfactual utilities and counterfactual regret minimization (CFR).		Notes html \| pdf
11	10/10	Equilibrium refinements Sequential irrationality. Extensive-form perfect equilibria and quasi-perfect equilibrium.		Notes html \| pdf
12	10/15	holiday No class Student holiday
13	10/17	project Project ideas and brainstorming
14	10/22	project Project break Coincides with INFORMS 2024 Annual Meeting
15	10/24	project Project break Coincides with INFORMS 2024 Annual Meeting
16	10/29	Deep reinforcement learning for large-scale games (part I) (Guest lecture by Samuel Sokota) Rough taxonomy of deep RL methods for games. Deep versions of fictitious play, CFR, double oracle. Game-theoretic version of PPO. Magnetic mirror descent.		Slides
17	10/31	Deep reinforcement learning for large-scale games (part II) (Guest lecture by Samuel Sokota) Public belief states. Applications to poker, Hanabi, team games.		Slides
Part III: Other structured games
18	11/5	Combinatorial games and Kernelized MWU (Part I) Example of combinatorial games. Kernelized multiplicative weights update algorithm.	FLLK'22	Notes html \| pdf
19	11/7	Combinatorial games and Kernelized MWU (Part II) Example of combinatorial games. Kernelized multiplicative weights update algorithm.		Notes html \| pdf
20	11/12 HW3	Computation of exact equilibria A second look at the minimax theorem. Hart and Schmeidler's proof of existence of correlated equilibria. Ellipsoid against hope algorithm.	PR'08 \| JLB'15 \| FP'24	Notes html \| pdf
21	11/14	Markov (aka stochastic) games Setting and taxonomy of equilibria. Stationary Markov Nash equilibria in inifinite-horizon games. Shapley's minimax theorem.		Notes html \| pdf
Part IV: Complexity of equilibrium computation
22	11/19	PPAD-completeness of Nash equilibria (part I) Sperner's lemma. The PPAD complexity class. Nash ∈ PPAD.		Notes html \| pdf
23	11/21	PPAD-completeness of Nash equilibria (part II) Arithmetic circuit SAT. PPAD-hardness of Nash equilibria.		Notes html \| pdf
Project break & presentations
24	11/26	project Project break
25	11/28	holiday No class Thanksgiving
26	12/3	project Project presentations		Phantom Tic-Tac-Toe challenge
27	12/5	project Project presentations

Below is a list of related courses at other schools.

Instructor(s)	Title	Year	School
Farina & Daskalakis	Topics in Multiagent Learning (Fall 2023 version of this course)	2023	MIT
Farina & Sandholm	Computational Game Solving	2021	CMU
Aaron Roth	Learning in Games (and Games in Learning)	2023	UPenn
Christian Kroer	Economics, AI, and Optimization	2020	Columbia
Tuomas Sandholm	Foundations of Electronic Marketplaces	2015	CMU
John P. Dickerson	Applied Mechanism Design for Social Good	2018	UMD
Fei Fang	Artificial Intelligence Methods for Social Good	2018	CMU
Constantinos Daskalakis	Games, Decision, and Computation	2015	MIT
Yiling Chen	Topics at the Interface between Computer Science and Economics	2016	Harvard
Vincent Conitzer	Computational Microeconomics: Game Theory, Social Choice, and Mechanism Design	2016	Duke
Ariel Procaccia	Truth, Justice, and Algorithms	2016	CMU
Tim Roughgarden	Algorithmic Game Theory	2013	Stanford

Topics in Multiagent Learning

Course Info

Course Structure

Schedule (subject to change)

Related Courses