| Academic Year | 2024/2025 |
| Name | Dott. - MI (1380) Ingegneria dell'Informazione / Information Technology |
| Programme Year | 1 |
| ID Code | 062774 |
| Course Title | MULTI-AGENT LEARNING: FROM THEORY TO PRACTICE |
| Course Type | MONO-DISCIPLINARY COURSE |
| Credits (CFU / ECTS) | 5.0 |
| Course Description |
"The aim of this course is to provide a broad overview of recent theoretical and practical advances in multi-agent learning. Multi-agent learning studies Artificial Intelligence techniques for designing artificial agents capable of making autonomous decisions in complex competitive and cooperative scenarios involving multiple agents, whether other artificial agents or human beings. Such multi-agent settings require techniques different from those usually employed for classical single-agent machine learning problems (such as reinforcement learning). In particular, multi-agent learning relies on notions from game theory and online convex optimization. The former provides theoretically grounded prescriptions for how to make decisions in multi-agent scenarios, while the latter enables agents to learn from experience in uncertain environments.
The course is split into two parts. The first provides the theoretical foundations in game theory and online convex optimization needed for the rest of the course. The second shows how the tools introduced in the first part can be used to design multi-agent learning algorithms for a variety of problems, such as multi-agent sequential decision-making processes, economic scenarios, and routing problems.
The following is a detailed description of the topics addressed by the course:
1) Theoretical foundations
1.1) Game-theoretic models
1.1.1) Game representations (games with simultaneous moves, games with sequential moves, succinct games)
1.1.2) Solution concepts (Nash equilibrium, Stackelberg equilibrium, correlated equilibrium)
1.1.3) Zero-sum games
1.2) Online convex optimization
1.2.1) Definition of an online learning problem (adversarial/stochastic environments, regret minimization, parallels with repeated games)
1.2.2) Special case: online learning with expert advice
1.2.3) Online gradient descent (algorithm, regret bounds)
1.2.4) Online mirror descent (algorithm, regret bounds)
1.2.5) Learning with partial feedback (multi-armed bandits, bandit convex optimization)
2) Learning in multi-agent environments
2.1) Learning equilibria in games with simultaneous moves
2.1.1) Regret minimization and Nash equilibria in zero-sum games
2.1.2) Regret matching (algorithm, Blackwell approachability game)
2.1.3) Learning correlated equilibria
2.2) Learning equilibria in games with sequential moves
2.2.1) Decomposing regret on the game tree (sequence form, regret circuits)
2.2.2) Counterfactual regret minimization (algorithm, regret bounds)
2.2.3) Deep learning and counterfactual regret minimization
2.3) Learning in economic scenarios
2.3.1) Auction settings
2.3.2) Learning in auctions with online gradient descent
2.4) Learning in routing and security problems
2.4.1) Adaptive routing and online linear optimization
2.4.2) Security games and multi-armed bandits
2.4.3) Dealing with partial feedback
2.5) Learning with humans in the loop
2.5.1) Dealing with constraints motivated by humans
2.5.2) Applications to human teaching" |
| Scientific-Disciplinary Sector (SSD) |
| SSD Code | SSD Description | CFU |
| ING-INF/05 | INFORMATION PROCESSING SYSTEMS | 5.0 |
| Alphabetical group: From (included) | Alphabetical group: To (excluded) | Name | Teaching Assignment Details |
| A | ZZZZ | Marchesi Alberto, Castiglioni Matteo | |