Class Times: Monday and Wednesday 10:30-12:00

Units: 3-0-9 H,G

Location: 46-5193

Instructors: Tomaso Poggio (TP), Lorenzo Rosasco (LR), Charlie Frogner (CF), Guille D. Canas (GJ)

Office Hours: Friday 1-2 pm in 46-5156, CBCL lounge (by appointment)

Email Contact: 9.520@mit.edu

Previous Class: SPRING 11

## Course description

The class introduces the theory and algorithms of computational learning in the framework of statistics and functional analysis. It gives an in-depth discussion of state-of-the-art machine learning algorithms for regression and classification, variable selection, manifold learning and transfer learning. The class focuses on the unifying role of regularization.

Many problems in applied science are inverse problems, and most inverse problems are ill-posed: the solution does not satisfy the basic requirements of existence, uniqueness and stability. As it turns out, most sensory problems are inverse and ill-posed problems. In a sense, intelligence is the ability to solve inverse problems effectively. Probably the most interesting inverse and ill-posed problem -- and the one which is at the very core of intelligence -- is the problem of learning from experience.

The theory and algorithms of regularization provide principled ways to solve ill-posed problems and restore well-posedness. Not surprisingly, most successful machine learning systems, such as MobilEye's vision system for cars and new systems for intelligent assistants, are based on regularization techniques.
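As a small illustration of how regularization restores stability, the following sketch (not part of the course material) solves a nearly singular 2x2 linear system with and without a Tikhonov penalty. The matrix `A`, the data vectors, the value of `lam`, and the helper names `solve_2x2` and `tikhonov_2x2` are all assumptions chosen for the example.

```python
import math


def solve_2x2(a, b):
    """Solve the 2x2 linear system a x = b with Cramer's rule."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    x1 = (b[0] * a22 - a12 * b[1]) / det
    x2 = (a11 * b[1] - a21 * b[0]) / det
    return [x1, x2]


def tikhonov_2x2(a, b, lam):
    """Minimize ||a x - b||^2 + lam * ||x||^2.

    The minimizer solves the normal equations (a^T a + lam I) x = a^T b.
    """
    ata = [
        [sum(a[k][i] * a[k][j] for k in range(2)) + (lam if i == j else 0.0)
         for j in range(2)]
        for i in range(2)
    ]
    atb = [sum(a[k][i] * b[k] for k in range(2)) for i in range(2)]
    return solve_2x2(ata, atb)


def distance(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)))


# Hypothetical ill-conditioned problem: the two rows of A are almost identical.
A = [[1.0, 1.0], [1.0, 1.0001]]
b_clean = [2.0, 2.0001]  # data generated by x = [1, 1]
b_noisy = [2.0, 2.0002]  # same data with a perturbation of size 1e-4
lam = 1e-2               # regularization parameter (an arbitrary choice)

# How far does the solution move when the data are perturbed?
unreg_shift = distance(solve_2x2(A, b_clean), solve_2x2(A, b_noisy))
reg_shift = distance(tikhonov_2x2(A, b_clean, lam), tikhonov_2x2(A, b_noisy, lam))

print("shift without regularization:", unreg_shift)
print("shift with regularization:   ", reg_shift)
```

Without the penalty, the tiny change in the data moves the solution by more than one unit, which is the instability described above. With the penalty, the shift stays small, at the price of a slight bias in the solution.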

The goal of this class is to provide students with the knowledge needed to use and develop effective computational learning solutions to challenging problems.

## Prerequisites

We will make extensive use of linear algebra, basic functional analysis (we cover the essentials in class and during the math camp), basic concepts in probability theory and concentration of measure (also covered in class and during the math camp). Students are expected to be familiar with Matlab.

## Grading

Requirements for grading (other than attending lectures) are: scribing one lecture, 2 problem sets, and a final project.

## Scribe Notes

In this class, there will be three to five unscribed lectures; of the remaining lectures, new scribe notes for classes #5, #6, #9, #11 will be created, while those of lectures #2 - #8, #10, #12, and lectures #14 - #18 will be edited from existing notes. Each student taking the class for credit will be required to work on improving and updating, or creating the scribe notes of one lecture. Scribe notes should be a natural integration of the presentation of the lectures with the material in the slides. The lecture slides are available on this website for your reference. Good scribe notes are important both for your grades, and for other students to read. In particular, please make an effort to present the material in a clear, concise, and comprehensive manner.

Scribe notes must be prepared with LaTeX, using the provided template. Scribe notes (.tex file and all additional files) should be submitted to 9.520@mit.edu no later than one week after the class. Please make sure to proofread the notes carefully before submitting. We will review the scribe notes to check the technical content and quality of writing. We will also give feedback and ask for a revised version if necessary. Completed scribe notes will be posted on this website as soon as possible.

You can sign up here to scribe a lecture. If you have problems opening or editing the page, please send us an email at 9.520@mit.edu. In addition, if you have any questions or concerns about the scribing requirement, please feel free to send us an email.

## Problem Sets

Problem set #1: due Monday March 19th. Data

Problem set #2: due Wednesday April 25th. Data

## Projects

Project abstract submission due Monday April 2nd.

Final project due Friday May 18th.

The final project can be either a Wikipedia entry or a research project (we recommend a Wikipedia entry).

We envision 2 kinds of research project:

- Applications: evaluate an algorithm on some interesting problem of your choice;
- Theory and Algorithms: study theoretically or empirically some new machine learning algorithm/problem.

For the Wikipedia article, we suggest a short one using the Wikipedia standard article format; for the research project you should use this template. Reports should be 8 pages maximum, including references. Additional material can be included in the appendix.

Updated project suggestions: Spring 2012 projects

## Syllabus

Follow the link for each class to find a detailed description, suggested readings, and class slides. Some of the later classes may be subject to reordering or rescheduling.

| Class | Date | Title | Instructor(s) |
|---|---|---|---|
| Class 01 | Wed 08 Feb | The Course at a Glance | TP |
| Class 02 | Mon 13 Feb | The Learning Problem and Regularization | TP |
| Class 03 | Wed 15 Feb | Reproducing Kernel Hilbert Spaces | LR |
| | Mon 20 Feb | President's Day | |
| Class 04 | Tues 21 Feb | Mercer Theorem and Feature Maps | LR |
| Class 05 | Wed 22 Feb | Tikhonov Regularization and the Representer Theorem | LR |
| Class 06 | Mon 27 Feb | Regularized Least Squares and Support Vector Machines | LR |
| Class 07 | Wed 29 Feb | Generalization Bounds, Intro to Stability | LR |
| Class 08 | Mon 05 Mar | Stability of Tikhonov Regularization | LR |
| Class 09 | Wed 07 Mar | Regularization Parameter Choice: Theory and Practice | LR |
| Class 10 | Mon 12 Mar | Bayesian Interpretations of Regularization | CF |
| Class 11 | Wed 14 Mar | Nonparametric Bayesian methods | CF |
| Class 12 | Mon 19 Mar | Spectral Regularization | LR |
| Class 13 | Wed 21 Mar | Loose ends, Project discussions | |
| | Mon 26 - Fri 30 Mar | Spring break | |
| Class 14 | Mon 02 Apr | Manifold Regularization | LR |
| Class 15 | Wed 04 Apr | Sparsity Based Regularization | LR |
| Class 16 | Mon 09 Apr | Regularization with Multiple Kernels | LR |
| Class 17 | Wed 11 Apr | Regularization for Multi-Output Learning | LR |
| | Mon 16 Apr | Patriot's Day | |
| Class 18 | Wed 18 Apr | On-line Learning | LR |
| Class 19 | Mon 23 Apr | Hierarchical Representation for Learning: Visual Cortex | TP |
| Class 20 | Wed 25 Apr | Hierarchical Representation for Learning: Mathematics | LR |
| Class 21 | Mon 30 Apr | Hierarchical Representation for Learning: Computational Model | TP |
| Class 22 | Wed 02 May | Learning Data Representation with Regularization | GC/LR |
| Class 23 | Mon 07 May | TBA | |
| Class 24 | Wed 09 May | Machine Learning for Humanoid Robotics | Giorgio Metta - IIT |
| Class 25 | Mon 14 May | Project Presentations | |
| Class 26 | Wed 16 May | Project Presentations | |

Math Camp, Mon 13 Feb (7-9pm): Functional analysis (slides, notes), Probability theory (notes)

- Old Math Camp Slides XX: Functional analysis
- Old Math Camp Slides XX: Probability theory

## Reading List

There is no textbook for this course. All the required information will be presented in the slides associated with each class. The books/papers listed below are useful general reference reading, especially from the theoretical viewpoint. A list of suggested readings will also be provided separately for each class.

## Primary References

- O. Bousquet, S. Boucheron and G. Lugosi. Introduction to Statistical Learning Theory. In Advanced Lectures on Machine Learning, Lecture Notes in Artificial Intelligence 3176, 169-207. (Eds.) O. Bousquet, U. von Luxburg and G. Ratsch. Springer, Heidelberg, Germany, 2004.
- F. Cucker and S. Smale. On The Mathematical Foundations of Learning. Bulletin of the American Mathematical Society, 2002.
- F. Cucker and D-X. Zhou. Learning theory: an approximation theory viewpoint. Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge, 2007.
- L. Devroye, L. Gyorfi, and G. Lugosi. A Probabilistic Theory of Pattern Recognition. Springer, 1997.
- T. Evgeniou, M. Pontil and T. Poggio. Regularization Networks and Support Vector Machines. Advances in Computational Mathematics, 2000.
- T. Poggio and S. Smale. The Mathematics of Learning: Dealing with Data. Notices of the AMS, 2003.
- I. Steinwart and A. Christmann. Support vector machines. Springer, New York, 2008.
- V. N. Vapnik. Statistical Learning Theory. Wiley, 1998.
- V. N. Vapnik. The Nature of Statistical Learning Theory. Springer, 1995.
- N. Cristianini and J. Shawe-Taylor. Introduction To Support Vector Machines. Cambridge, 2000.

## Background Mathematics References

- A. N. Kolmogorov and S. V. Fomin, Introductory Real Analysis, Dover Publications, 1975.
- A. N. Kolmogorov and S. V. Fomin, Elements of the Theory of Functions and Functional Analysis, Dover Publications, 1999.
- Luenberger, Optimization by Vector Space Methods, Wiley, 1969.
## Neuroscience Related References

- Serre, T., L. Wolf, S. Bileschi, M. Riesenhuber and T. Poggio. "Object Recognition with Cortex-like Mechanisms", IEEE Transactions on Pattern Analysis and Machine Intelligence, 29, 3, 411-426, 2007.
- Serre, T., A. Oliva and T. Poggio. "A Feedforward Architecture Accounts for Rapid Categorization", Proceedings of the National Academy of Sciences (PNAS), Vol. 104, No. 15, 6424-6429, 2007.
- S. Smale, L. Rosasco, J. Bouvrie, A. Caponnetto, and T. Poggio. "Mathematics of the Neural Response", Foundations of Computational Mathematics, Vol. 10, 1, 67-91, June 2009.