STAT928: Statistical Learning Theory and Sequential Prediction

Spring 2014

Time & Location: Fridays 12:30-3:30pm in JMHH F36

Instructors: Alexander Rakhlin, Karthik Sridharan

Course Description

This course will focus on theoretical aspects of Statistical Learning and Sequential Prediction (Online Learning). In the first part of the course, we will analyze learning with i.i.d. data using classical tools: concentration inequalities, random averages, covering numbers, and combinatorial parameters. We then focus on sequential prediction and develop many of the same tools for learning in this scenario. The latter part is based on recent research and offers many directions for further investigation. The minimax approach, which we emphasize throughout the course, offers a systematic way of comparing learning problems. Beyond the theoretical analysis, we will discuss learning algorithms and, in particular, an important connection between learning and optimization. Our framework will give a handle on developing near-optimal and computationally efficient algorithms. We will illustrate this on the problems of matrix completion, link prediction, and others. Time permitting, we will make excursions into Information Theory and Game Theory, and show how our new tools seamlessly yield a number of interesting results.

Prerequisites: Probability Theory and Linear Algebra.

Lecture Notes

These lecture notes are constantly evolving, so if your version says x<y today, it might say x>y tomorrow.

The "Algorithms" section will go through an overhaul this semester.

Tentative Schedule

Introduction. Overview of Problems in Learning, Estimation, Optimization
Minimax Formulation
Background Material: Stochastic Processes, Empirical Processes, Concentration and Deviation Inequalities
Statistical Learning

Empirical Risk Minimization, Uniform Glivenko-Cantelli classes, Vapnik-Chervonenkis Dimension, Growth Function
Finite Class Lemma, Covering and Packing Numbers, Pollard's Bound
Chaining for Subgaussian Processes, Symmetrization, Rademacher Averages, Dudley's Bound
Combinatorial Dimensions, Vapnik-Chervonenkis-Sauer-Shelah Lemma, Lower Bounds

Sequential Prediction and Decision Making

Prediction with Expert Advice, Exponential Weights Algorithm, Proof of von Neumann's Minimax Theorem
Sequential Minimax Theorem, Dual Representation of the Value, Martingale uGC, Finite Class Lemma
Symmetrization, Sequential Rademacher, Majorization for Martingales
Sequential Covering Numbers, Chaining, Dudley-type Bound
Combinatorial Dimensions, Analogue of V-C-S-S Lemma, Learnability in Supervised Setting
Algorithms for Non-Convex Problems: Halving for Finite Classes, SOA
Algorithms for Convex Problems: Mirror Descent, Follow the Leader, Follow the Regularized Leader

From Sequential to Statistical Learning: Relationship Between the Minimax Values
Optimality of Mirror Descent. Type and M-Type of a Banach Space
Model Selection and Oracle Inequalities in Statistics and Online Learning
Logarithmic Loss: Stochastic and Deterministic Settings, Redundancy-Capacity Theorem
Decision Theory for Individual Sequences: Beyond Regret
Blackwell's Approachability: Two Proofs (Geometric and Minimax)
Prequential Statistics, Calibration of Forecasters, Testing
Algorithmic Stability
Aggregation of Estimators

Suggested Readings

Articles:

Books (not required):

Other relevant courses:
