course number | instructor | title |
CS 6104 | B Ji | Online Learning and Sequential Decision Making |
Course Description
This course offers a rigorous treatment of online learning and sequential decision making, which have applications in many domains, including recommender systems, internet advertising, self-driving cars, investment portfolio selection, clinical trials, hyperparameter tuning of neural networks, and game playing (for example, Go and Atari). We will cover several classic paradigms of online learning and sequential decision making, including online convex optimization, multi-armed bandits, Bayesian optimization, and reinforcement learning. These paradigms capture a key dilemma the decision maker faces: whether to choose the best option found so far or to explore new options to gather more information. In other words, how should one address the fundamental tradeoff between exploitation and exploration? The course emphasizes recent advances in algorithm design, along with the mathematical techniques needed for theoretical performance analysis.
Prerequisites
The prerequisites are basic knowledge of probability theory and linear algebra. A certain level of mathematical maturity is expected. We will briefly review key mathematical techniques as they are needed in the analysis. Grading is based on a summary of selected papers and a final project (including a report and a presentation).
References
Online Learning and Online Convex Optimization, by Shai Shalev-Shwartz Link: https://www.cs.huji.ac.il/w~shais/papers/OLsurvey.pdf
Introduction to Multi-Armed Bandits, by Aleksandrs Slivkins Link: https://arxiv.org/pdf/1904.07272.pdf
Bandit Algorithms, by Tor Lattimore and Csaba Szepesvári Link: https://tor-lattimore.com/downloads/book/book.pdf
Reinforcement Learning: An Introduction, by Richard S. Sutton and Andrew G. Barto Link: http://incompleteideas.net/book/the-book.html
Selected papers from top-tier machine learning conferences: NeurIPS, ICML, AAAI, COLT, ICLR, AISTATS, IJCAI, UAI.