Please note that the schedule is tentative and subject to change over the next few weeks.

There are 39 accepted papers in total: 4 oral presentations and 35 posters.

Time Type Duration Title & Speaker / Author(s)

9:00-10:30 Tutorial 10 min Intro + Tutorial, part 1: RL in games, from Checkers to AlphaZero, foundations.
Marc Lanctot.
Tutorial 20 min Tutorial, part 2: RL in Markov games.
Julien Pérolat.
Tutorial 20 min Tutorial, part 3: RL (and search) in imperfect information games.
Martin Schmid.
Oral 20 min Towards Optimal Play of Three-Player Piglet and Pig.
Todd Neller, François Bonnet and Simon Viennot.
Poster 20+ min Poster Session #1


11:00-12:30 Invited Talk 30 min Learning in Zero-Sum Games Revisited: From von Neumann to Poincaré, Hamilton and Legendre.
Georgios Piliouras.
Oral 20 min Finite-Time Convergence of Gradient-Based Learning in Continuous Games.
Benjamin Chasnov, Lillian J. Ratliff, Daniel Calderone, Eric Mazumdar, and Samuel A. Burden.
Oral 20 min Heterogeneous Knowledge Transfer via Hierarchical Teaching in Cooperative Multiagent Reinforcement Learning.
Dong-Ki Kim, Miao Liu, Shayegan Omidshafiei, Sebastian Lopez-Cot, Matthew Riemer, Gerald Tesauro, Murray Campbell, Golnaz Habibi and Jonathan How.
Poster 20+ min Poster Session #2

Lunch Break

2:00-3:15 Invited Talk 30 min Beyond classical bandit tools for Monte-Carlo Tree Search.
Emilie Kaufmann.
Oral 20 min Learning Policies from Human Data for Skat.
Douglas Rebstock, Chris Solinas and Michael Buro.
Poster 20+ min Poster Session #3


4:00-5:00 Invited Talk 30 min Making Decisions in General Sum Environments.
Michael Littman.
Panel 30 min Panel + Discussion

Poster Session #1 (Starts at 10:10)

Deep Reinforcement Learning for Green Security Games with Real-Time Information.
Yufei Wang, Zheyuan Ryan Shi, Lantao Yu, Yi Wu, Rohit Singh, Lucas Joppa and Fei Fang.

Solving Imperfect-Information Games via Discounted Regret Minimization.
Noam Brown and Tuomas Sandholm.

Solving Large Sequential Games with the Excessive Gap Technique.
Christian Kroer, Gabriele Farina and Tuomas Sandholm.

Combining No-regret and Q-learning.
Ian Kash and Katja Hofmann.

Pommerman: A Multi-Agent Playground.
Cinjon Resnick, Kyunghyun Cho, Denny Britz, Joan Bruna, Julian Togelius, Wes Eldridge, David Ha and Jakob Foerster.

Neural Fictitious Self-Play on ELF Mini-RTS.
Keigo Kawamura and Yoshimasa Tsuruoka.

Winning Isn't Everything: Training Agents to Playtest Modern Games.
Igor Borovikov, Yunqi Zhao, Ahmad Beirami, Jesse Harder, John Kolen, James Pestrak, Jervis Pinto, Reza Pourabolghasem, Harold Chaput, Mohsen Sardari, Long Lin, Navid Aghdaie and Kazi Zaman.

Double Neural Counterfactual Regret Minimization.
Hui Li, Kailiang Hu, Zhibang Ge, Tao Jiang and Le Song.

Coordinating the Crowd: Inducing Desirable Equilibria in Non-Cooperative Systems.
David Mguni, Joel Jennings, Sergio Valcarcel Macua, Emilio Sison, Sofia Ceppi and Enrique Munoz de Cote.

Robust Multi-Agent Reinforcement Learning via Minimax Deep Deterministic Policy Gradient.
Shihui Li, Yi Wu, Xinyue Cui, Honghua Dong, Fei Fang and Stuart Russell.

Approximation Gradient Error Variance Reduced Optimization.
Weiye Zhao, Yang Liu, Xiaoming Zhao, Jielin Qiu and Jian Peng.

A Meta-MDP Approach to Improve Exploration in Reinforcement Learning.
Francisco Garcia and Philip Thomas.

Poster Session #2 (Starts at 12:10)

Application of self-play deep reinforcement learning to "Big 2", a four-player game of imperfect information.
Henry Charlesworth.

Online Convex Optimization for Sequential Decision Processes and Extensive-Form Games.
Gabriele Farina, Christian Kroer and Tuomas Sandholm.

Ex ante coordination in team games.
Gabriele Farina, Andrea Celli, Nicola Gatti and Tuomas Sandholm.

Single-Agent Policy Tree Search With Guarantees.
Laurent Orseau, Levi Lelis, Tor Lattimore and Théophane Weber.

Ranked Reward: Enabling Self-Play Reinforcement Learning for Bin Packing.
Alexandre Laterre, Yunguan Fu, Mohamed Khalil Jabri, Alain-Sam Cohen, David Kas, Karl Hajjar, Hui Chen, Torbjorn Dahl, Amine Kerkeni and Karim Beguir.

Depth-Limited Solving for Imperfect-Information Games.
Noam Brown, Tuomas Sandholm and Brandon Amos.

Combo-Action: Training Agent For FPS Game with Auxiliary Tasks.
Shiyu Huang.

Thompson Sampling for Pursuit-Evasion Problems.
Zhen Li, Nick Meyer, Eric Laber and Robert Brigantic.

Challenges of Context and Time in Reinforcement Learning: Introducing Space Fortress as a Benchmark.
Akshat Agarwal, Ryan Hope and Katia Sycara.

Learning Task-Specific Representations of Environment Models in Deep Reinforcement Learning.
Yota Mizutani and Yoshimasa Tsuruoka.

Using Monte Carlo Tree Search as a Demonstrator within Asynchronous Deep RL.
Bilal Kartal, Pablo Hernandez-Leal and Matthew Edmund Taylor.

Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation.
Niels Justesen, Ruben Torrado, Philip Bontrager, Ahmed Khalifa, Julian Togelius and Sebastian Risi.

Poster Session #3 (Starts at 2:55)

Backplay: 'Man muss immer umkehren'.
Cinjon Resnick, Roberta Raileanu, Sanyam Kapoor, Alexander Peysakhovich, Kyunghyun Cho and Joan Bruna.

Evolutionarily-Curated Curriculum Learning for Deep Reinforcement Learning Agents.
Michael Green, Benjamin Sergent, Pushyami Panindrashayi Shandilya and Vibhor Kumar.

Quasi-Perfect Stackelberg Equilibrium.
Alberto Marchesi, Gabriele Farina, Christian Kroer, Nicola Gatti and Tuomas Sandholm.

A Framework for Macro Discovery for Efficient State-Set Exploration.
Francisco Garcia and Bruno Castro Da Silva.

Deep Counterfactual Regret Minimization.
Noam Brown, Adam Lerer, Sam Gross and Tuomas Sandholm.

Composability of Regret Minimizers.
Gabriele Farina, Christian Kroer and Tuomas Sandholm.

Practical Exact Algorithm for Trembling-Hand Equilibrium Refinements in Games.
Gabriele Farina, Nicola Gatti and Tuomas Sandholm.

Learning to Assign Credit in Reinforcement Learning by Incorporating Abstract Relations.
Dong Yan, Hang Su and Jun Zhu.

Confidence-Based Aggregation of Multi-Step Returns for Reinforcement Learning.
Girish Jeyakumar and Balaraman Ravindran.

Learning Self-Game-Play Agents for Combinatorial Optimization Problems.
Ruiyang Xu and Karl Lieberherr.

Internal Model from Observations for Reward Shaping.
Daiki Kimura, Subhajit Chaudhury, Ryuki Tachibana and Sakyasingha Dasgupta.