Class Times: Monday and Wednesday 10:30-12:00
Units: 3-0-9 H,G
Location: 46-5193
Instructors: Tomaso Poggio (TP), Lorenzo Rosasco (LR), Charlie Frogner (CF)
TA: Andre Wibisono
Office Hours: Friday 1-2 pm in 46-5165
Email Contact: 9.520@mit.edu
Previous Class: SPRING 09
Sign up for final project presentation here.
Course description
Focuses on the problem of supervised and unsupervised learning from the perspective of modern statistical learning theory, starting with the theory of multivariate function approximation from sparse data. Develops basic tools such as regularization, including support vector machines for regression and classification. Derives generalization bounds using stability. Discusses current research topics such as manifold regularization, sparsity, feature selection, Bayesian connections and techniques, and online learning. Emphasizes applications in several areas: computer vision, speech recognition, and bioinformatics. Discusses advances in the neuroscience of the cortex and their impact on learning theory and applications.
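As a small taste of the first of these tools, the sketch below (illustrative only, not part of the course materials) implements Tikhonov-regularized least squares in numpy: it minimizes ||Xw - y||^2 + lam ||w||^2, whose closed-form solution is w = (X^T X + lam I)^{-1} X^T y.

```python
import numpy as np

def regularized_least_squares(X, y, lam):
    """Tikhonov-regularized least squares (ridge regression).

    Solves (X^T X + lam * I) w = X^T y, the normal equations of
    min_w ||Xw - y||^2 + lam * ||w||^2.
    """
    n, d = X.shape
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Toy usage: recover a linear function from noisy data.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=50)
w_hat = regularized_least_squares(X, y, lam=1e-3)
```

With a small regularization parameter and low noise, w_hat closely recovers w_true; the course develops why such penalized estimators generalize well.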
Prerequisites
6.867 or permission of instructor. In practice, a substantial level of mathematical maturity is necessary. Familiarity with probability and functional analysis will be very helpful. We try to keep the mathematical prerequisites to a minimum, but we will introduce complicated material at a fast pace.
Grading
Requirements for grading (other than attending lectures) are: scribing one lecture, two problem sets, and a final project.
Scribe Notes
In this class we will scribe 13 lectures: lectures #2 - #11, and lectures #14 - #16. Each student who is taking this class for credit will be required to scribe one lecture. Scribe notes should be a natural integration of the presentation of the lectures with the material in the slides. The lecture slides are available on this website for your reference. Good scribe notes are important, both for your grades and for other students to read. In particular, please make an effort to present the material in a clear, concise, and comprehensive manner. You can look at the lecture notes of 6.854/18.415 (Fall 2008) on OCW to see examples of the expected format and quality of scribe notes.
Scribe notes must be prepared with LaTeX, using this template. Scribe notes (.tex file and all additional files) should be submitted to 9.520@mit.edu no later than one week after the class. Please make sure to proofread the notes carefully before submitting. We will review the scribe notes to check the technical content and quality of writing. We will also give feedback and ask for a revised version if necessary. Completed scribe notes will be posted on this website as soon as possible.
You can sign up here for scribe notes. If you have problems opening or editing the page, please send us an email at 9.520@mit.edu. In addition, if you have any questions or concerns about the scribing requirement, please feel free to send us an email.
Problem Sets
Problem set #1: PDF | two moons dataset -- due Monday, March 15th
Problem set #2: PDF -- due Wednesday, April 14th UPDATED 4/6
Projects
Project ideas: PDF
Abstract Proposals due Monday, March 29th
Syllabus
Follow the link for each class to find a detailed description, suggested readings, and class slides. Some of the later classes may be subject to reordering or rescheduling.
Date | Title | Instructor(s) | Scribe notes
Class 01 | Wed 03 Feb | The Course at a Glance | TP | ---
Class 02 | Mon 08 Feb | The Learning Problem and Regularization | TP | ---
Class 03 | Wed 10 Feb | Reproducing Kernel Hilbert Spaces | LR |
Mon 15 Feb - President's Day (no class)
Class 04 | Tue 16 Feb | Regularized Least Squares | CF |
Class 05 | Wed 17 Feb | Several Views Of Support Vector Machines | CF |
Class 06 | Mon 22 Feb | Generalization Bounds, Intro to Stability | LR/TP |
Class 07 | Wed 24 Feb | Stability of Tikhonov Regularization | LR/TP | ---
Class 08 | Mon 01 Mar | Spectral Regularization | LR | ---
Class 09 | Wed 03 Mar | Manifold Regularization | LR |
Class 10 | Mon 08 Mar | Sparsity Based Regularization | LR |
Class 11 | Wed 10 Mar | Regularization Methods for Multi-Output Learning | LR |
Class 12 | Mon 15 Mar | Regularization Methods for Online Learning | Sasha Rakhlin | ---
Class 13 | Wed 17 Mar | Loose Ends, Project Discussions | TP/LR | ---
SPRING BREAK March 19-25
Class 14 | Mon 29 Mar | Vision and Visual Neuroscience | TP | ---
Class 15 | Wed 31 Mar | Vision and Visual Neuroscience (HMAX) | TP | ---
Class 16 | Mon 05 Apr | Derived Kernels | LR | ---
Class 17 | Wed 07 Apr | Application of Belief Nets to Modelling Attention | Sharat Chikkerur | ---
Class 18 | Mon 12 Apr | Active Learning: Closing the Loop between Data Analysis and Acquisition | Rui Castro | ---
Class 19 | Wed 14 Apr | Deep Boltzmann Machines | Ruslan Salakhutdinov | ---
Mon 19 Apr - Patriot's Day (no class)
Class 20 | Wed 21 Apr | Bayesian Interpretations of Regularization | CF | ---
Class 21 | Mon 26 Apr | Bayesian Learning and Nonparametric Bayesian Methods | LR | ---
Class 22 | Wed 28 Apr | Approximate Inference | Ruslan Salakhutdinov | ---
Class 23 | Mon 03 May | Multiple Kernel Learning | LR | ---
Class 24 | Wed 05 May | Geometry and Analysis of Data Sets in High Dimension | Mauro Maggioni | ---
Class 25 | Mon 10 May | Project Presentations | --- | ---
Class 26 | Wed 12 May | Project Presentations | --- | ---
Math Camp | Mon 08 Feb | Functional analysis: slides, notes. Probability theory: notes
Reading List
There is no textbook for this course. All the required information will be presented in the slides associated with each class. The books/papers listed below are useful general reference reading, especially from the theoretical viewpoint. A list of suggested readings will also be provided separately for each class.
Primary References
- O. Bousquet, S. Boucheron and G. Lugosi. Introduction to Statistical Learning Theory. In O. Bousquet, U. von Luxburg and G. Rätsch (Eds.), Advanced Lectures on Machine Learning, Lecture Notes in Artificial Intelligence 3176, 169-207. Springer, Heidelberg, Germany, 2004.
- F. Cucker and S. Smale. On The Mathematical Foundations of Learning. Bulletin of the American Mathematical Society, 2002.
- F. Cucker and D-X. Zhou. Learning theory: an approximation theory viewpoint. Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge, 2007.
- L. Devroye, L. Gyorfi, and G. Lugosi. A Probabilistic Theory of Pattern Recognition. Springer, 1997.
- T. Evgeniou and M. Pontil and T. Poggio. Regularization Networks and Support Vector Machines. Advances in Computational Mathematics, 2000.
- T. Poggio and S. Smale. The Mathematics of Learning: Dealing with Data. Notices of the AMS, 2003
- I. Steinwart and A. Christmann. Support vector machines. Springer, New York, 2008.
- V. N. Vapnik. Statistical Learning Theory. Wiley, 1998.
- V. N. Vapnik. The Nature of Statistical Learning Theory. Springer, 1995.
Secondary References
- O. Bousquet and A. Elisseeff, Stability and Generalization, Journal of Machine Learning Research, Vol. 2, pp.499-526, 2002.
- N. Cristianini and J. Shawe-Taylor. Introduction To Support Vector Machines. Cambridge, 2000.
- L. Lo Gerfo, L. Rosasco, F. Odone, E. De Vito and A. Verri. Spectral Algorithms for Supervised Learning, to appear in Neural Computation.
- Poggio, T., R. Rifkin, S. Mukherjee and P. Niyogi. General Conditions for Predictivity in Learning Theory, Nature, Vol. 428, 419-422, 2004 (see also Past Performance and Future Results).
- R. Rifkin and R. A. Lippert. Notes on Regularized Least-Squares, CBCL Paper #268/AI Technical Report #2007-019, Massachusetts Institute of Technology, Cambridge, MA, May 2007.
- Rifkin, R. and A. Klautau. In Defense of One-vs-All Classification, Journal of Machine Learning Research, Vol. 5, 101-141, 2004.
Background Mathematics References
- A. N. Kolmogorov and S. V. Fomin, Introductory Real Analysis, Dover Publications, 1975.
- A. N. Kolmogorov and S. V. Fomin, Elements of the Theory of Functions and Functional Analysis, Dover Publications, 1999.
- Luenberger, Optimization by Vector Space Methods, Wiley, 1969.
Neuroscience Related References
- Serre, T., L. Wolf, S. Bileschi, M. Riesenhuber and T. Poggio. "Object Recognition with Cortex-like Mechanisms", IEEE Transactions on Pattern Analysis and Machine Intelligence, 29, 3, 411-426, 2007.
- Serre, T., A. Oliva and T. Poggio."A Feedforward Architecture Accounts for Rapid Categorization", Proceedings of the National Academy of Sciences (PNAS), Vol. 104, No. 15, 6424-6429, 2007.
- S. Smale, L. Rosasco, J. Bouvrie, A. Caponnetto, and T. Poggio. "Mathematics of the Neural Response", CBCL Paper #276/MIT CSAIL Technical Report #TR2008-070, Massachusetts Institute of Technology, Cambridge, MA, November, 2008