We first analyze the limits of learning in high dimension. We then stress the difference between the high-dimensional ambient space and the intrinsic geometry associated with the marginal distribution. We observe that, in the semi-supervised setting, unlabeled data can be used to exploit the low dimensionality of the intrinsic geometry. To formalize these intuitions, we briefly introduce the manifold Laplacian and the graph Laplacian. Finally, we introduce a new class of regularization algorithms aimed at enforcing smoothness relative to the intrinsic geometry.
Slides for this lecture: PDF.
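As a rough illustration of the graph Laplacian mentioned above, the following sketch builds one from unlabeled points and uses it to measure how smooth a function is along the data. It is not taken from the lecture: the Gaussian edge weights, the unnormalized form L = D - W, the bandwidth `sigma`, and the toy data (a circle embedded in a 50-dimensional ambient space) are all assumptions chosen for illustration.

```python
import numpy as np

def graph_laplacian(X, sigma=1.0):
    """Unnormalized graph Laplacian L = D - W with Gaussian weights (an assumed choice)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)      # guard against tiny negative round-off
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)                  # no self-loops
    D = np.diag(W.sum(axis=1))                # degree matrix
    return D - W, W

def smoothness(f, L):
    """Quadratic form f^T L f = 1/2 * sum_ij W_ij (f_i - f_j)^2."""
    return float(f @ L @ f)

# Toy data: a 1-D manifold (a circle) embedded in a 50-D ambient space.
rng = np.random.default_rng(0)
n, ambient_dim = 200, 50
theta = np.sort(rng.uniform(0.0, 2.0 * np.pi, n))
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
Q, _ = np.linalg.qr(rng.standard_normal((ambient_dim, 2)))  # orthonormal columns
X = circle @ Q.T                                            # shape (n, ambient_dim)

L, W = graph_laplacian(X, sigma=0.3)

f_smooth = np.sin(theta)              # varies slowly along the circle
f_rough = rng.permutation(f_smooth)   # same values, scrambled along the circle

print("smooth penalty:", smoothness(f_smooth, L))
print("rough penalty: ", smoothness(f_rough, L))
```

The penalty depends only on how the function varies between nearby data points, not on the ambient dimension, which is the sense in which a regularizer of this kind can enforce smoothness relative to the intrinsic geometry. In a semi-supervised setting one would typically add such a term, computed over labeled and unlabeled points together, to a standard regularized loss.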